EEVblog Electronics Community Forum

Products => Computers => Topic started by: Martin F on February 19, 2020, 02:04:21 pm

Title: Freelance task (dask + s3fs)
Post by: Martin F on February 19, 2020, 02:04:21 pm
Hi all, we're looking for a freelancer to help us with a dask/s3fs project.

Basically, we want to process large datasets (time series of vehicle data such as speed, RPM, ...) stored in an S3 bucket. The aim is to process the data (e.g. 500 MB across 1-50 files) on, say, an EC2 instance and pass the resampled data (or summary statistics) on for use in e.g. Dash (by Plotly).

One complication is that our data is stored in a non-standard binary format. We can provide an iterator over it, but dask would need to be set up to extract the data through this iterator.
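
To make the idea a bit more concrete, here is a rough sketch of the kind of setup we have in mind, not a definitive implementation. The record layout, the `iter_records` stand-in for our iterator, the column names and the helper names `summarise_one` / `summarise_many` are all made up for illustration; our real format differs. Each file becomes one `dask.delayed` task that runs the iterator and resamples, so only the small resampled frames are combined at the end.

```python
import struct
from typing import BinaryIO, Iterator, Tuple

import dask
import pandas as pd

# Hypothetical record layout (illustration only): little-endian
# float64 timestamp in seconds, float32 speed, float32 RPM.
RECORD = struct.Struct("<dff")


def iter_records(fh: BinaryIO) -> Iterator[Tuple[float, float, float]]:
    """Stand-in for the real iterator over the binary format."""
    while True:
        chunk = fh.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return  # end of file (or a truncated trailing record)
        yield RECORD.unpack(chunk)


def summarise_one(path, open_file=open, rule="1s") -> pd.DataFrame:
    """Read one file through the iterator and resample it to `rule`."""
    with open_file(path, "rb") as fh:
        df = pd.DataFrame(list(iter_records(fh)), columns=["t", "speed", "rpm"])
    df["t"] = pd.to_datetime(df["t"], unit="s")
    return df.set_index("t").resample(rule).mean()


def summarise_many(paths, open_file=open, rule="1s") -> pd.DataFrame:
    """Run one delayed task per file, then combine the resampled results."""
    tasks = [dask.delayed(summarise_one)(p, open_file, rule) for p in paths]
    results = dask.compute(*tasks)  # executes the per-file tasks in parallel
    return pd.concat(list(results)).sort_index()
```

With s3fs, something like `fs = s3fs.S3FileSystem()` followed by `summarise_many(fs.glob("my-bucket/logs/*.bin"), open_file=fs.open)` should read straight from the bucket (the bucket name and prefix here are hypothetical). The resulting pandas DataFrame is small enough to hand to Dash directly.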

We think it's probably a matter of relatively few lines of code to get this working, but we're hoping to find somebody with experience who can help us set it up in a smart, efficient and scalable way.

If you're interested or know somebody, please reach out to me.