Boosting Performance with System.IO.Pipelines in C#

The Challenge

We will be experimenting with reading a large CSV file of employee data: 100,000 records with 5 fields each. I'm sure you have run into this challenge many times in your career, where you have to parse a large CSV file. Doing so naively puts significant pressure on the GC. Garbage collection matters for performance because it consumes CPU time that would otherwise go to the actual data processing, and each time a collection is triggered much of the work is suspended while live references are evaluated, which can drastically affect the total processing time. I've chosen the techniques listed below for this challenge.
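For reference, here is a minimal sketch of the data shape the examples in this post assume; the field names (Id, Name, Email, Department, Salary) and the presence of a header row are my assumptions, not the actual schema of the test file.

```csharp
// Assumed CSV layout: one header row followed by 100,000 data rows, e.g.
//
//   Id,Name,Email,Department,Salary
//   1,Jane Doe,jane.doe@example.com,Engineering,85000
//
// Salary is kept as a string here purely to keep the parsing sketches short.
public record Employee(int Id, string Name, string Email, string Department, string Salary);
```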

  1. Using IAsyncEnumerable: introduced in C# 8, this lets a data stream be processed in chunks instead of reading the whole file into memory (see the first sketch after this list).
  2. Using System.IO.Pipelines: this API shipped with .NET Core 2.1 and is used internally by Kestrel, the ASP.NET Core web server, to process many requests per second from the socket with high performance. It is available as a NuGet package. David Fowler, who architected these APIs, has an excellent introductory post (see the second sketch after this list).
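As a first sketch, here is roughly what the IAsyncEnumerable approach can look like, reusing the Employee record from the sketch above. The class and method names are mine, and the original implementation may differ in details such as header handling.

```csharp
using System.Collections.Generic;
using System.IO;

public static class AsyncEnumerableCsvReader
{
    // Streams the file one line at a time instead of loading it all into memory.
    public static async IAsyncEnumerable<Employee> ReadEmployeesAsync(string path)
    {
        using var reader = new StreamReader(path);

        // Skip the header row (assumed to be present).
        await reader.ReadLineAsync();

        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            var fields = line.Split(',');
            if (fields.Length < 5)
                continue; // skip malformed rows

            yield return new Employee(
                int.Parse(fields[0]), fields[1], fields[2], fields[3], fields[4]);
        }
    }
}
```

The caller consumes this with await foreach, so only one record has to be materialised at a time, although each line still allocates a string plus a string array for the fields.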
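And here is a sketch of the System.IO.Pipelines approach, following the line-splitting pattern from the official documentation and David Fowler's introduction. The names (PipelineCsvReader, CountRecordsAsync) and the choice to simply count records are mine, kept minimal for illustration.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.IO.Pipelines;
using System.Text;
using System.Threading.Tasks;

public static class PipelineCsvReader
{
    public static async Task<int> CountRecordsAsync(string path)
    {
        using var stream = File.OpenRead(path);
        var reader = PipeReader.Create(stream);
        var count = 0;

        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            // Process every complete line currently sitting in the buffer.
            while (TryReadLine(ref buffer, out ReadOnlySequence<byte> line))
            {
                ProcessLine(line);
                count++;
            }

            // Report what was consumed (the processed lines) and what was examined
            // (everything), so the reader knows it needs to fetch more data.
            reader.AdvanceTo(buffer.Start, buffer.End);

            if (result.IsCompleted)
                break; // a real implementation would also handle a final unterminated line

        }

        await reader.CompleteAsync();
        return count;
    }

    private static bool TryReadLine(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
    {
        SequencePosition? newline = buffer.PositionOf((byte)'\n');
        if (newline == null)
        {
            line = default;
            return false;
        }

        line = buffer.Slice(0, newline.Value);
        buffer = buffer.Slice(buffer.GetPosition(1, newline.Value));
        return true;
    }

    private static void ProcessLine(in ReadOnlySequence<byte> line)
    {
        // For brevity this still allocates a string; a fully allocation-free version
        // would slice the byte sequence field by field instead.
        var text = Encoding.UTF8.GetString(line.ToArray()).TrimEnd('\r');
        var fields = text.Split(',');
        // ... map the 5 fields to an Employee here ...
    }
}
```

The interesting part is that the pipe hands back a ReadOnlySequence<byte> over pooled buffers, so the reading machinery itself does not copy the file bytes into intermediate strings or arrays; how much of that advantage survives depends on how each line is then parsed.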

Benchmark Results

I've used the BenchmarkDotNet library to measure the performance.
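For context, here is a minimal sketch of how the two approaches could be wired into BenchmarkDotNet. The benchmark class, the file path, and the use of MemoryDiagnoser are my assumptions rather than the exact configuration behind the results below, but MemoryDiagnoser is what surfaces the allocation and GC numbers this comparison is about.

```csharp
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser] // report allocations and Gen 0/1/2 collections alongside timings
public class CsvReaderBenchmarks
{
    // Path to the 100,000-record test file; adjust for your environment.
    private const string FilePath = "employees.csv";

    [Benchmark(Baseline = true)]
    public async Task<int> AsyncEnumerable()
    {
        var count = 0;
        await foreach (var _ in AsyncEnumerableCsvReader.ReadEmployeesAsync(FilePath))
            count++;
        return count;
    }

    [Benchmark]
    public Task<int> Pipelines() => PipelineCsvReader.CountRecordsAsync(FilePath);
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<CsvReaderBenchmarks>();
}
```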

Performance Benchmark Results
