Google is now indexing CSV files, although it previously used CSV data only through structured data for enhanced search appearances.
Google quietly added a note to its Google Search Central documentation indicating that it now indexes .csv files.
As a result, there is a new way to be crawled. Alternatively, publishers who don’t want their .csv files crawled may need to update robots.txt to exclude those files.
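As a rough illustration of that robots.txt change, the Python sketch below holds a rule that blocks URLs ending in .csv and checks a few paths against it. The `/*.csv$` pattern relies on the `*` and `$` wildcards that Google's robots.txt documentation describes. The matcher functions and sample paths are hypothetical and only approximate how a crawler would evaluate the rule.

```python
import re

# Hypothetical robots.txt content: block any URL whose path ends in ".csv".
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /*.csv$
"""


def disallow_patterns(robots_txt: str) -> list[str]:
    """Collect the Disallow patterns from a robots.txt string (ignores user-agent groups)."""
    patterns = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            patterns.append(value.strip())
    return patterns


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard pattern: '*' matches any characters, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_blocked(path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns(robots_txt))


for path in ["/data/report.csv", "/data/report.csv?download=1", "/about.html"]:
    print(path, "blocked" if is_blocked(path) else "allowed")
```

In this sketch the query-string variant is not blocked, because the trailing `$` requires the URL to end in .csv. Dropping the `$` would widen the rule to cover it.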
CSV (Comma-Separated Values)
Comma-Separated Values (CSV) files are text files that store data in a tabular format that can be viewed as a spreadsheet.
Because they store only plain text, CSV files don’t contain other formatting components such as fonts, graphics, or clickable links.
They are nonetheless useful for organizing data in a spreadsheet.
For example, they can be used to upload a list of URLs for crawling to programs like Screaming Frog.
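To make the plain-text nature of the format concrete, here is a minimal Python sketch that reads a small CSV of URLs with the standard library's csv module. The column names and URLs are made up for illustration and do not come from any particular tool.

```python
import csv
import io

# Hypothetical CSV content: a header row followed by one URL per row.
CSV_TEXT = "url,priority\nhttps://example.com/,high\nhttps://example.com/about,low\n"

# csv.DictReader maps each data row onto the header names.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
urls = [row["url"] for row in rows]

print(urls)
```

The file is nothing more than lines of comma-separated text, which is why the same content opens cleanly in a spreadsheet or a text editor.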
CSV File Indexing Is New
Google has used CSV files indirectly before, but its ability to index them appears to be new: “filetype” searches on Google for CSV files, such as the following, do not yet return CSV files.
- filetype:csv site:.gov
- filetype:csv site:.edu
- filetype:csv site:.com
What is curious about Google’s indexing of CSV files is that Google already used CSV files in its Dataset search appearance, but only when they were accompanied by structured data.
CSV files are a valid format for showing up in dataset search features, according to old Google Developer documentation (viewable on Archive.org) on dataset structured data.
Tabular data has been eligible for a search appearance since 2018, when Google announced that it would show that type of content in search when it is accompanied by structured data.
The original documentation reads as follows:
Datasets are easier to find when you provide supporting information, such as their name, description, creator, and distribution formats, as structured data.
Here are a few instances of what might be considered a dataset:
- A CSV file or table containing data
- A well-organized group of tables
- A data file with a proprietary structure
- A group of files that together make up a meaningful dataset
- A structured object with data in some other format that you might want to load into a specialized tool for processing
- Images capturing data
- Files relating to machine learning, such as trained parameters or definitions of neural networks
- Anything that looks like a dataset to you
In 2022, Google updated that documentation and pointed users to the new Search Central documentation.
The updated documentation makes it clearer that Google uses CSV files for its dataset search appearance and relies on structured data to do so.
But does this change suggest that, in addition to using tabular data marked up with structured data, Google will soon crawl CSV files and use them for search appearances?
The current documentation reads as follows:
When supporting details like a dataset’s name, description, creator, and distribution formats are provided as structured data, finding the datasets becomes easier.
Schema.org and other metadata standards that can be added to pages that describe datasets are used in Google’s approach to dataset discovery.
Here are a few instances of what might be considered a dataset:
- An informational table or CSV file…
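As a sketch of what that markup might look like for a CSV download, the Python snippet below builds Dataset structured data as JSON-LD. The schema.org types and properties (`Dataset`, `DataDownload`, `encodingFormat`, `contentUrl`) are real; the dataset name, organization, and URL are hypothetical placeholders.

```python
import json

# Hypothetical dataset details; replace with real values for an actual page.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example city rainfall measurements",
    "description": "Monthly rainfall totals for an example city, published as a CSV file.",
    "creator": {"@type": "Organization", "name": "Example Weather Office"},
    "distribution": [
        {
            "@type": "DataDownload",
            "encodingFormat": "CSV",
            "contentUrl": "https://example.com/data/rainfall.csv",
        }
    ],
}

# Serialize the markup and wrap it in the script tag used for JSON-LD on a web page.
json_ld = json.dumps(dataset, indent=2)
script_tag = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(script_tag)
```

The printed script tag would go in the HTML of the page that describes the dataset, not in the CSV file itself.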
Is Google’s CSV Indexing Connected to Recent Updates?
Google’s core algorithm updates are described as involving “significant” and “broad changes.”
It’s possible that the near-simultaneous timing of the CSV indexing and the core algorithm update is a coincidence.
However, it is worth asking whether Google upgraded its crawling engine to support CSV indexing or whether that capability was already present.
Read the updated list of indexable file types:
File types indexable by Google
Read Google’s Search Central Dataset Documentation:
Dataset (Dataset, DataCatalog, DataDownload) structured data