Overview
This article explains how to use the collate_csvs.zsh script to merge multiple CSV files into a single output file. The script also detects rows that are duplicated across the files and skips them, so the merged file contains no repeated entries. It is intended for batch CSV exports that need consolidation, as in the requirement from Skullcandy.
Purpose of the Script
The collate_csvs.zsh script performs the following actions:
- Combines all .csv files from the csv/ folder into a single output.csv file.
- Retains the header only from the first file.
- Identifies and skips duplicate rows.
- Reports which lines are duplicated and from which file.
- Tracks processing time and number of rows added.
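The attached script is not reproduced in this article, so the following is only a minimal sketch of the behavior listed above, written in portable shell with awk. The function name collate_sketch, the demo folder, and the sample data are hypothetical. The sketch assumes that every file starts with a header row and that a duplicate is an exact whole-line match with a row already written; the real script may define duplicates differently.

```shell
# Hypothetical sketch, not the attached collate_csvs.zsh.
# Assumes every file starts with a header row and that a "duplicate"
# is an exact whole-line match with a row already written.
collate_sketch() {
  src_dir="$1"
  out_file="$2"
  start=$(date +%s)
  awk -v out="$out_file" '
    FNR == 1 {
      # Header row: keep it from the first file only.
      if (!header_written) { print > out; header_written = 1 }
      next
    }
    ($0 in seen) {
      # Row already written: report its origin and skip it.
      printf "Duplicate: %s line %d (first seen in %s)\n", FILENAME, FNR, seen[$0]
      next
    }
    { seen[$0] = FILENAME; print > out; added++ }
    END { printf "Rows added: %d\n", added }
  ' "$src_dir"/*.csv
  finish=$(date +%s)
  echo "Elapsed: $((finish - start))s"
}

# Demo on throwaway sample data.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/csv"
printf 'id,name\n1,Ann\n2,Bob\n' > "$demo_dir/csv/a.csv"
printf 'id,name\n2,Bob\n3,Cy\n' > "$demo_dir/csv/b.csv"
collate_sketch "$demo_dir/csv" "$demo_dir/output.csv"
cat "$demo_dir/output.csv"
```

In the demo, the row 2,Bob appears in both files, so it is written once and reported once as a duplicate. Keeping the seen rows in an awk array makes the check a single pass over the input, at the cost of holding every unique row in memory.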
How to Use on Windows (via WSL)
Prerequisites:
- Windows Subsystem for Linux (WSL) installed (Ubuntu is recommended).
- The collate_csvs.zsh script. (Check the attachment at the end of this article)
- Your .csv files should be inside a folder named csv, in the same directory as the script.
Steps:
1. Install WSL (if not installed): Open PowerShell as Administrator and run:
wsl --install
2. Reboot if prompted, and follow instructions to complete Ubuntu installation.
3. In the WSL (Ubuntu) terminal, which you can open by typing wsl in PowerShell, navigate to the folder on your Desktop that contains the script:
cd /mnt/c/Users/<YourUsername>/Desktop/<YourFoldername>
4. Make the script executable:
chmod +x collate_csvs.zsh
5. Run the script:
bash collate_csvs.zsh
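6. The script prints a summary of the run: any duplicate rows it skipped (with their source file), the number of rows added, and the processing time.
7. When it finishes, output.csv is created in the same folder as the script.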
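Step 5 runs a .zsh script with bash, which only works when the script avoids zsh-specific syntax. A default Ubuntu installation typically does not include zsh, so a small check like the one below can show which interpreter is available. This is a minimal sketch: the function name pick_shell is hypothetical, and whether collate_csvs.zsh runs correctly under bash is an assumption that depends on the script itself. If zsh is missing, it can be installed on Ubuntu with sudo apt install zsh.

```shell
# Hypothetical helper: report which interpreter is available for a .zsh script.
pick_shell() {
  if command -v zsh >/dev/null 2>&1; then
    echo zsh
  else
    # Fallback only works if the script avoids zsh-only syntax (an assumption).
    echo bash
  fi
}

echo "Suggested command: $(pick_shell) collate_csvs.zsh"
```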
How to Use on macOS
Prerequisites:
- Terminal access.
- zsh installed (included with macOS and the default shell since macOS 10.15 Catalina).
Steps:
1. Place your script and a folder called csv (containing your CSV files) in the same directory.
2. Open Terminal and navigate to that directory:
cd ~/Desktop/test # or wherever your script is located
3. Make the script executable:
chmod +x collate_csvs.zsh
4. Run the script:
./collate_csvs.zsh
Output
On both Windows and macOS, the script creates output.csv in the same directory as the script, overwriting any existing file with that name. Each duplicate entry is logged with its line number and source file.
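After a run, the merged file can be sanity-checked with standard tools. The helper below is a minimal sketch, not part of the attached script: the name check_output and the sample file are hypothetical. It counts data rows (everything after the header) and counts rows that still appear more than once, assuming one record per line with no quoted multi-line fields.

```shell
# Hypothetical helper: summarize a merged CSV.
# Assumes one record per line (no quoted fields spanning several lines).
check_output() {
  file="$1"
  # Data rows: every line after the header.
  rows=$(tail -n +2 "$file" | wc -l | tr -d ' ')
  # Distinct rows that occur more than once.
  dups=$(tail -n +2 "$file" | sort | uniq -d | wc -l | tr -d ' ')
  echo "data rows: $rows, repeated rows: $dups"
}

# Demo on a throwaway sample file.
sample=$(mktemp)
printf 'id,name\n1,Ann\n2,Bob\n3,Cy\n' > "$sample"
check_output "$sample"
```

On a real run, the equivalent call would be check_output output.csv from the script's folder; a non-zero count of repeated rows would suggest that duplicate detection did not behave as expected.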
Common Issues
- “No such file or directory” errors: Ensure all .csv files are inside the csv folder.
- Permission errors: Use chmod +x collate_csvs.zsh to grant execution rights.
- Wrong directory: Confirm you’re running the script from the correct folder.
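The three checks above can be combined into one pre-flight test that is run before the script. This is a minimal sketch under the folder layout described in this article (collate_csvs.zsh next to a csv folder); the function name preflight and the demo folder are hypothetical.

```shell
# Hypothetical pre-flight check for the layout described in this article.
preflight() {
  dir="${1:-.}"
  problems=0
  if [ ! -f "$dir/collate_csvs.zsh" ]; then
    echo "collate_csvs.zsh not found in $dir (wrong directory?)"
    problems=$((problems + 1))
  elif [ ! -x "$dir/collate_csvs.zsh" ]; then
    echo "collate_csvs.zsh is not executable; run: chmod +x collate_csvs.zsh"
    problems=$((problems + 1))
  fi
  if [ ! -d "$dir/csv" ]; then
    echo "csv folder not found in $dir"
    problems=$((problems + 1))
  elif [ -z "$(find "$dir/csv" -maxdepth 1 -name '*.csv')" ]; then
    echo "no .csv files inside $dir/csv"
    problems=$((problems + 1))
  fi
  # Exit status is 0 only when no problem was found.
  [ "$problems" -eq 0 ]
}

# Demo on a throwaway folder with an empty stand-in for the real script.
demo=$(mktemp -d)
mkdir "$demo/csv"
: > "$demo/collate_csvs.zsh"
chmod +x "$demo/collate_csvs.zsh"
: > "$demo/csv/sample.csv"
preflight "$demo" && echo "layout looks fine"
```

Called without an argument, preflight checks the current directory, which matches how the script is run in the steps above.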