GitHub - shemul/pandas-multiprocessing: simple demonstration of python pandas multiprocessing

A simple multiprocessing implementation in Python using a pandas DataFrame.

git clone https://github.com/shemul/pandas-multiprocessing
cd pandas-multiprocessing
pipenv install
pipenv run python main.py --input_csv="./data/users.csv" --output_csv="./output/users.csv" --chunk_size=300 --pool=10

Here, --pool sets how many worker processes are spawned, and --chunk_size sets how many rows each worker processes at a time.
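
As a rough illustration of how the two options interact, here is a minimal sketch of chunked processing with a process pool. It is not taken from main.py: the function names (split_into_chunks, process_chunk, process_in_parallel) and the per-chunk work are hypothetical, chosen only to show the pattern.

```python
from multiprocessing import Pool

import pandas as pd


def split_into_chunks(df, chunk_size):
    """Slice the frame into consecutive pieces of at most chunk_size rows."""
    return [df.iloc[start:start + chunk_size]
            for start in range(0, len(df), chunk_size)]


def process_chunk(chunk):
    """Hypothetical per-chunk work; the real transformation in main.py may differ."""
    out = chunk.copy()  # avoid mutating the caller's frame
    out["row_count_in_chunk"] = len(out)
    return out


def process_in_parallel(df, chunk_size=300, pool=10):
    """Run process_chunk over every chunk using `pool` worker processes."""
    chunks = split_into_chunks(df, chunk_size)
    if not chunks:
        # pd.concat raises on an empty list, so return early for empty input
        return df.copy()
    with Pool(processes=pool) as workers:
        # map preserves chunk order, so rows come back in their original order
        results = workers.map(process_chunk, chunks)
    return pd.concat(results, ignore_index=True)
```

When calling process_in_parallel from a script, wrap the call in an `if __name__ == "__main__":` guard; platforms that start workers with the spawn method (Windows, macOS) re-import the script in each worker and need it.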

Todos

- update readme.md

Folders and files (4 commits)

- .idea
- data
- tests
- Pipfile
- Pipfile.lock
- main.py
- readme.md
