News API v2 launched!

Today I’m excited to announce the launch of News API v2.

It’s taken over 500 git commits and has come in more than 9 months later than planned, but we’re finally ready to show you what we’ve been working on. Who knew that building an accurate, performant, web-scale crawler could be so complex? 😁

We set out to hit two goals with v2:

  • To meet the most requested missing feature: keyword-search ALL the news.
  • To build a foundation for quicker and more efficient development of the service moving forward.

News API was conceived almost 18 months ago with only one thing in mind – to provide breaking news headlines from a range of popular news sources. V1 did exactly this, but it offered only basic functionality. We didn’t store news history, and there was a cap of 10 articles available per source at any one time. Our database contained no more than 700 articles at once, and there were few options to sculpt or filter the results.

We listened to your feedback and realised that, while the uses for this data were limited, the potential was great if we set our sights just a little bit higher.

Since the Google News API was deprecated back in 2011, no one has stepped in to provide an accessible, simple, and easy-to-use API for searching news all over the web. This was the main takeaway from talking to our most vocal users. It is what drove the development of v2 and shaped the first of our goals.

The second goal of v2 has primarily been a behind-the-scenes endeavour. Back in v1, News API tracked only 70 sources. The algorithms for scanning each source were all hard-coded into the core code of the News API application, and while this worked well for getting things off the ground quickly, it became a bear to maintain and scale.

We needed to fix this process so that we could scale up to thousands of sources and fix any accuracy issues quickly. To do this we’ve been building a bespoke DSL and engine to parse and monitor article-based websites. There’s still work to be done here, but News API is now tracking over 5,000 sources effortlessly and there are tens of thousands more in the pipeline for the coming months. Eventually, I hope we can open-source our work for others to benefit from too.
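To give a flavour of why moving from hard-coded scanners to a DSL helps, here is a minimal sketch of the general idea: each source is described by a small piece of data, and one generic engine applies it. This is not our actual DSL or engine. The `SourceRule` fields, the `extract_articles` function, the regex-based matching, and the sample markup are all hypothetical and chosen purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SourceRule:
    """Hypothetical per-source rule: plain data instead of bespoke code."""
    source_id: str
    title_pattern: str  # regex with one capture group for the headline
    url_pattern: str    # regex with one capture group for the article link


def extract_articles(rule: SourceRule, html: str) -> List[Dict[str, str]]:
    """Generic engine: the same function serves every source."""
    titles = re.findall(rule.title_pattern, html)
    urls = re.findall(rule.url_pattern, html)
    # Pair headlines with links in document order (an assumption of this sketch).
    return [
        {"source": rule.source_id, "title": title.strip(), "url": url}
        for title, url in zip(titles, urls)
    ]


# Adding a source means adding a rule, not touching the engine.
RULES = [
    SourceRule(
        source_id="example-times",
        title_pattern=r'<h2 class="headline">(.*?)</h2>',
        url_pattern=r'<a class="story" href="(.*?)">',
    ),
]

SAMPLE_HTML = """
<a class="story" href="https://example.com/a1"><h2 class="headline">First story</h2></a>
<a class="story" href="https://example.com/a2"><h2 class="headline"> Second story </h2></a>
"""

articles = extract_articles(RULES[0], SAMPLE_HTML)
for article in articles:
    print(article)
```

The point of the design is that an accuracy problem with one site becomes a one-line rule change rather than a change to the application's core code. A real engine would need a proper HTML parser and far richer rules than two regular expressions.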

So that’s where we are. We still have a way to go to fully realise all our ideas and further improve News API for you, but I think we’re making good progress.

Finally, here are some numbers to quantify our journey so far:

  • Months since v1 launch: 18.
  • API keys registered: 43,554.
  • Average API requests/second: 750.
  • New articles indexed yesterday: 132,958.
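For a rough sense of scale, the figures above can be converted into daily, per-minute, and monthly rates. This is only back-of-the-envelope arithmetic: it assumes the request average holds across a full 24-hour day, that yesterday was a typical day for indexing, and that key registrations were spread evenly over the 18 months. The variable names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

# Figures taken from the list above.
months_since_launch = 18
api_keys_registered = 43_554
avg_requests_per_second = 750
articles_indexed_yesterday = 132_958

# Derived rates (assumptions: steady traffic, a typical day, even sign-ups).
requests_per_day = avg_requests_per_second * SECONDS_PER_DAY
articles_per_minute = articles_indexed_yesterday / MINUTES_PER_DAY
keys_per_month = api_keys_registered / months_since_launch

print(f"Requests per day:       {requests_per_day:,}")
print(f"Articles per minute:    {articles_per_minute:.1f}")
print(f"New API keys per month: {keys_per_month:,.0f}")
```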

My inbox is open to feedback and I’d love to hear what you think of the changes, or what features you feel are still missing.

Happy querying! 📰📰📰