Simple Converter for Google Scholar CSV files to Excel (xlsx) using Python


Exporting and handling Google Scholar CSV exports can drive you mad. I wrote a simple converter script using Python and pandas to speed up the task.

import pandas as pd
import glob

# Path
path = '/path/to/look/at'

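# Search the given path recursively and find files matching the pattern.
# Note the slash before "**": without it the pattern is glued onto the
# last path component and nothing below the directory is matched.
files = glob.glob(path + "/**/*Google*.csv", recursive=True)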

for filename in files:
    with open(filename, "r", encoding="utf-8") as ins:
        array = []
        for line in ins:

            # Replace quotes
            line = line.replace('"', '')

            # Temporarily replace comma-followed-by-whitespace
            # occurrences with an underscore
            line = line.replace(', ', '_')

            # Split from the right on at most 7 commas
            line = [x.strip() for x in line.rsplit(',', 7)]

            # Restore comma-followed-by-whitespace occurrences
            line = [x.replace('_', ', ') for x in line]

            # Store processed line
            array.append(line)

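    # Convert the list of rows to a pandas DataFrame
    df = pd.DataFrame(array)
    # Build the output file name by swapping the extension
    new_name = filename.replace('.csv', '.xlsx')
    # Store df in Excel format, without index or header row
    # (pandas needs an Excel writer engine such as openpyxl for this)
    df.to_excel(new_name, index=False, header=False)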
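
To see what the line-splitting step does in isolation, here is a minimal sketch of it as a standalone function. The function name, the sample line, and its column layout are made up for illustration and are not taken from a real export. One deliberate difference from the script above: instead of an underscore, the sketch uses the ASCII unit separator as the temporary placeholder, on the assumption that it never occurs in the data, so underscores already present in a field are not turned into commas.

```python
def split_scholar_line(line, trailing_fields=7):
    """Split one raw line into fields, keeping ', ' inside the first field."""
    # Assumption: this control character never appears in the export
    placeholder = "\x1f"

    # Replace quotes
    line = line.replace('"', '')

    # Temporarily hide comma-followed-by-whitespace occurrences
    line = line.replace(', ', placeholder)

    # Split from the right on at most `trailing_fields` commas
    fields = [x.strip() for x in line.rsplit(',', trailing_fields)]

    # Restore comma-followed-by-whitespace occurrences
    return [x.replace(placeholder, ', ') for x in fields]


# Hypothetical sample line with 8 columns
sample = '"Doe, J, Roe, R",A study of things,Journal of Stuff,12,3,45-67,2015,Elsevier\n'
fields = split_scholar_line(sample)
print(fields[0])    # Doe, J, Roe, R
print(len(fields))  # 8
```

Because the split runs from the right, any plain commas (without a following space) left in the first column stay in that column, as long as the remaining seven columns contain none.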
