Google has recently announced that Dataset Search is now out of beta. It is a service that lets you search for close to 25 million different publicly available data sets. It was first launched in September 2018.
According to reports, Dataset Search is getting a mobile version, and Google is adding a few new features. The first is a new filter that lets you choose which type of data set you want to see (tables, images, text, etc.), which makes it easier to find the data you’re looking for. The company has also added more information about the data sets and the organizations that publish them.
People using Dataset Search so far range from academic researchers and students to business analysts. The most common search queries include “education,” “weather,” “cancer,” “crime,” “soccer,” and “dogs.” The largest topics covered by the datasets include geosciences, biology, and agriculture. The most popular dataset format is tables, with more than 6 million of them included in Dataset Search.
Researchers can use these data sets, which range from fairly small ones that tell you how many dogs there were in the United States from 2010 to 2018 to large annotated audio and image sets. The tool currently indexes about 6 million tables. A lot of the data in the search index comes from government agencies. According to Google, there are about 2 million U.S. government data sets in the index right now. However, users will also regularly see results from Google’s own Kaggle, as well as from many other public and private organizations that make public data available.
How To Get Indexed in Dataset Search?
Although Dataset Search is officially out of beta, Google is committed to improving it going forward, just as it’s always improving its main search engine.
The process publishers have to go through to have their datasets included in Dataset Search remains the same. Anybody who publishes data can make their datasets discoverable by using the appropriate schema.org structured data.
As Google notes, anybody who owns an interesting data set can make it available for indexing by using standard schema.org markup to describe the data in more detail.
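To make the markup step more concrete, here is a minimal sketch of what schema.org structured data for a dataset might look like when generated with Python. It uses the schema.org `Dataset` and `DataDownload` types, embedded as JSON-LD in a script tag. The helper names (`build_dataset_jsonld`, `to_script_tag`), the example.org URLs, and the sample values are hypothetical and only for illustration; the properties Google actually expects may be broader, so treat this as a starting point rather than a complete recipe.

```python
import json


def build_dataset_jsonld(name, description, url, publisher, download_url,
                         encoding_format="text/csv"):
    """Build a schema.org Dataset description as a plain dictionary.

    Hypothetical helper: the property names come from schema.org,
    but the selection of properties here is an assumption.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "creator": {"@type": "Organization", "name": publisher},
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }


def to_script_tag(metadata):
    """Serialize the metadata as JSON-LD wrapped in an HTML script tag."""
    payload = json.dumps(metadata, indent=2)
    # Escape "</" so the JSON cannot close the script tag early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Illustrative values only; the organization and URLs are made up.
example = build_dataset_jsonld(
    name="Dog population in the United States, 2010-2018",
    description="Yearly estimates of the number of dogs in the United States.",
    url="https://example.org/datasets/us-dogs",
    publisher="Example Pet Statistics Office",
    download_url="https://example.org/datasets/us-dogs.csv",
)

print(to_script_tag(example))
```

The resulting script tag would go into the HTML of the page that describes the dataset, so that a crawler can read the description alongside the human-readable content.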