Open source Data science libraries – AI, ML, NLP

The best open-source data science libraries for AI, ML, and NLP.


Scikit-learn is a free, open-source machine learning library for the Python programming language. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, and k-means. It is one of the oldest and most widely used machine learning libraries.
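
As a rough illustration of the classification side, here is a minimal sketch that trains a support vector machine on the Iris dataset bundled with scikit-learn. The dataset, the 75/25 split, and the RBF kernel are choices made for this example only, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the small Iris dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for evaluation (split chosen for illustration).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a support vector classifier with an RBF kernel.
clf = SVC(kernel="rbf", gamma="scale")
clf.fit(X_train, y_train)

# Mean accuracy on the held-out samples.
accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators follow the same fit / predict / score pattern, so swapping SVC for another classifier usually changes only the import and the constructor line.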

SciPy is a Python-based ecosystem of open-source software for mathematics, science, and engineering, and also the name of its core library. The SciPy library builds on the NumPy array object and is part of the NumPy stack, which includes tools like Matplotlib, pandas, and SymPy, along with an expanding set of scientific computing libraries.
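
To give a feel for what the SciPy library adds on top of NumPy, the sketch below uses two of its submodules: numerical integration and scalar minimization. The functions being integrated and minimized are made up for this example.

```python
import numpy as np
from scipy import integrate, optimize

# Numerically integrate sin(x) from 0 to pi; the exact answer is 2.
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# Minimize a simple parabola whose minimum is at x = 3 (illustrative function).
result = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2 + 1.0)

print(f"integral of sin on [0, pi]: {area:.6f}")
print(f"minimizer of the parabola:  {result.x:.6f}")
```

Other submodules such as scipy.stats, scipy.signal, and scipy.sparse follow the same idea: they take and return NumPy arrays or plain Python numbers.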

NumPy is the fundamental package for scientific computing with Python. It contains, among other things, a powerful N-dimensional array object and useful linear algebra, Fourier transform, and random number capabilities.
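
The four capabilities mentioned above can each be shown in a couple of lines. This is a small sketch with made-up numbers: a 2x3 array, a 2x2 linear system, a sine wave with four cycles, and seeded random samples.

```python
import numpy as np

# N-dimensional array: six numbers reshaped into 2 rows and 3 columns.
a = np.arange(6, dtype=float).reshape(2, 3)

# Linear algebra: solve 3x + y = 9 and x + 2y = 8.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# Fourier transform: a sine wave with 4 cycles over 64 samples
# should peak in frequency bin 4.
t = np.arange(64) / 64
signal = np.sin(2 * np.pi * 4 * t)
spectrum = np.abs(np.fft.rfft(signal))
peak_bin = int(np.argmax(spectrum))

# Random numbers: a seeded generator gives reproducible samples.
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=1000)

print(a.shape, x, peak_bin, samples.mean())
```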

pandas is a powerful, user-friendly Python library for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.
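
Here is a minimal sketch of both uses, a table and a time series, in one DataFrame. The city names and temperatures are invented for the example.

```python
import pandas as pd

# A small table indexed by date (values are made up for illustration).
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Rome", "Rome"],
        "temp_c": [2.0, 4.0, 12.0, 14.0],
    },
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# Table-style operation: average temperature per city.
mean_by_city = df.groupby("city")["temp_c"].mean()

# Time-series operation: average temperature over 2-day windows.
two_day_mean = df["temp_c"].resample("2D").mean()

print(mean_by_city)
print(two_day_mean)
```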

The Natural Language Toolkit, more commonly known as NLTK, is a suite of Python libraries and programs for symbolic and statistical natural language processing (NLP), aimed primarily at English. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, and wrappers for industrial-strength NLP libraries.
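
Tokenization and stemming are the easiest parts to try out. The sketch below sticks to a regex-based tokenizer and the Porter stemmer because, as far as I know, neither needs any extra corpus download; the sample sentence is made up.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

text = "NLTK provides tokenization and stemming."

# Split the sentence into word and punctuation tokens using a regex.
tokens = wordpunct_tokenize(text)

# Reduce each token to its stem with the Porter algorithm.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```

Other features, such as nltk.word_tokenize, part-of-speech tagging, or the WordNet interface, generally require fetching the relevant resources first with nltk.download.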

Matplotlib is a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib tries to make easy things easy and hard things possible. You can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, and more with just a few lines of code.
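
To show what "a few lines of code" looks like, here is a minimal sketch that draws a histogram and a scatter plot side by side and renders them to PNG. The random data, the figure size, and the non-interactive Agg backend are assumptions made so the example runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

# Random data, seeded so the figure is reproducible.
rng = np.random.default_rng(0)
values = rng.normal(size=500)

# One figure with two panels: a histogram and a scatter plot.
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))
ax_hist.hist(values, bins=20)
ax_hist.set_title("Histogram")
ax_scatter.scatter(values[:-1], values[1:], s=5)
ax_scatter.set_title("Scatter plot")

# Save to an in-memory PNG; pass a filename instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
```

In an interactive session or notebook you would normally skip the Agg line and call plt.show() instead of saving to a buffer.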

