- 7 Top Open Source Datasets to Train Natural Language Processing (NLP) & Text Models - Nov 8, 2021.
With a lot of excitement and research around NLP, there are growing opportunities to apply these technologies to real-world scenarios. Becoming familiar with NLP is not trivial, and these open-source datasets can help you build your skills.
Dataset, NLP, Open Source
- How to Build Data Frameworks with Open Source Tools to Enhance Agility and Security - Oct 27, 2021.
Let’s take a look at how to harness open source tools to build your data frameworks.
Data Democratization, Deployment, Open Source, Security
- Introducing PostHog: An open-source product analytics platform - Sep 23, 2021.
PostHog is an open-source product analytics platform that helps you and your product team capture, analyze, and make informed decisions based on user behaviour.
Data Analytics, Data Platform, Open Source, Platform, Tools
- The Machine & Deep Learning Compendium Open Book - Sep 16, 2021.
After years in the making, this extensive and comprehensive ebook resource is now available and open for data scientists and ML engineers. Learn from and contribute to this tome of valuable information to support all your work in data science from engineering to strategy to management.
Deep Learning, ebook, GitHub, Machine Learning, Open Source
- Open Source Datasets for Computer Vision - Aug 18, 2021.
Access to high-quality, noise-free, large-scale datasets is crucial for training complex deep neural network models for computer vision applications. Many open-source datasets are developed for use in image classification, pose estimation, image captioning, autonomous driving, and object segmentation. These datasets must be paired with the appropriate hardware and benchmarking strategies to optimize performance.
Computer Vision, Datasets, Open Source
- Querying the Most Granular Demographics Dataset - Aug 13, 2021.
Having access to broad and detailed population data can potentially offer enormous value to any organization looking to interact with specific demographics. However, access alone is not sufficient without being able to leverage advanced techniques to explore and visualize the data.
Big Data, Data Visualization, Geolocation, Neo4j, Open Source
- Facebook Open Sources a Chatbot That Can Discuss Any Topic - Jul 27, 2021.
The new version expands the capabilities of its predecessor, building a much more natural conversational experience.
Chatbot, Facebook, NLP, Open Source
- How to Use Kafka Connect to Create an Open Source Data Pipeline for Processing Real-Time Data - Jul 23, 2021.
This article shows you how to create a real-time data pipeline using only pure open source technologies. These include Kafka Connect, Apache Kafka, Kibana and more.
Data Processing, Kafka, Open Source, Pipeline, Real-time
- 7 Open Source Libraries for Deep Learning Graphs - Jul 15, 2021.
In this article we’ll go through 7 up-and-coming open source libraries for graph deep learning, ranked in order of increasing popularity.
Deep Learning, Graphs, Open Source
- Amazing Low-Code Machine Learning Capabilities with New Ludwig Update - Jun 22, 2021.
Integrations with Ray, MLflow and TabNet are among the top features of this release.
Low-Code, Machine Learning, Open Source, Uber
- The 7 Best Open Source AI Libraries You May Not Have Heard Of - Jun 9, 2021.
AI researchers today have many exciting options for working with specialized tools. Although starting original projects from scratch is often not necessary, knowing which existing library to leverage remains a challenge. This list of generally unknown yet awesome, open-source libraries offers an interesting collection to consider for state-of-the-art research that spans from automatic machine learning to differentiable quantum circuits.
AI, Hyperparameter, Julia, Open Source, Probability, Quantum Computing
- 5 Data Science Open-source Projects You Should Consider Contributing to - Jun 7, 2021.
As you prepare to interview for a position in data science or look to jump to the next level, now is the time to enhance your skills and your resume by working on real, open-source projects. Here, we suggest a great selection of projects you can contribute to and help build something awesome, so all you need to do is choose one and tackle it head on.
Caffe, Data Science, Data Science Skills, GitHub, Google, Machine Learning, Open Source
- Binary Classification with Automated Machine Learning - May 17, 2021.
Check out how to use the open-source MLJAR auto-ML to build accurate models faster.
Automated Machine Learning, AutoML, Classification, Open Source
- Easy AutoML in Python - Apr 1, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Python
- Google’s Model Search is a New Open Source Framework that Uses Neural Networks to Build Neural Networks - Mar 1, 2021.
The new framework brings state-of-the-art neural architecture search methods to TensorFlow.
Automated Machine Learning, AutoML, Google, Neural Networks, Open Source
- Easy, Open-Source AutoML in Python with EvalML - Feb 16, 2021.
We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. EvalML is a library for automated machine learning (AutoML) and model understanding, written in Python.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Python
- Facebook Open Sources ReBeL, a New Reinforcement Learning Agent - Dec 14, 2020.
The new model tries to recreate the reinforcement learning and search methods used by AlphaZero in imperfect information scenarios.
Agents, AI, Facebook, Open Source, Reinforcement Learning
- Facebook Open Sourced New Frameworks to Advance Deep Learning Research - Nov 17, 2020.
Polygames, PyTorch3D and HiPlot are the new additions to Facebook’s open source deep learning stack.
Deep Learning, Facebook, Open Source, PyTorch, Research
- Microsoft and Google Open Sourced These Frameworks Based on Their Work Scaling Deep Learning Training - Nov 2, 2020.
Google and Microsoft have recently released new frameworks for distributed deep learning training.
Deep Learning, Google, Microsoft, Open Source, Scalability, Training
- Uber Open Sources the Third Release of Ludwig, its Code-Free Machine Learning Platform - Oct 13, 2020.
The new release makes Ludwig one of the most complete open source AutoML stacks in the market.
Automated Machine Learning, AutoML, Machine Learning, Open Source, Uber
- Netflix’s Polynote is a New Open Source Framework to Build Better Data Science Notebooks - Aug 5, 2020.
The new notebook environment provides substantial improvements to streamline experimentation in machine learning workflows.
IDE, Jupyter, Netflix, Open Source, Scala
- What I learned from looking at 200 machine learning tools - Jul 21, 2020.
While hundreds of machine learning tools are available today, the ML software landscape may still be underdeveloped with more room to mature. This review considers the state of ML tools, existing challenges, and which frameworks are addressing the future of machine learning software.
Data Science Platform, Data Science Tools, Machine Learning, MLOps, Open Source, Tools
- Lynx Analytics is open-sourcing LynxKite, its Complete Graph Data Science Platform - Jun 25, 2020.
Check out this article for a brief summary on what LynxKite is, where it is coming from and how it can help with your data science projects.
Data Science Platform, Graph Analytics, Open Source
- Uber’s Ludwig is an Open Source Framework for Low-Code Machine Learning - Jun 15, 2020.
The new framework allows developers with minimal experience to create and train machine learning models.
Low-Code, Machine Learning, No-Code, Open Source, Uber
- LinkedIn Open Sources a Small Component to Simplify the TensorFlow-Spark Interoperability - May 25, 2020.
Spark-TFRecord enables the processing of TensorFlow’s TFRecord structures in Apache Spark.
LinkedIn, Open Source, Spark, TensorFlow
- Build and deploy your first machine learning web app - May 22, 2020.
A beginner’s guide to train and deploy machine learning pipelines in Python using PyCaret.
App, Flask, Heroku, Machine Learning, Modeling, Open Source, Pipeline, PyCaret, Python
- Google Open Sources SimCLR, A Framework for Self-Supervised and Semi-Supervised Image Training - Apr 27, 2020.
The new framework uses contrastive learning to improve image analysis in unlabeled datasets.
Google, Image Recognition, Open Source, Self-supervised Learning
- Announcing PyCaret 1.0.0 - Apr 21, 2020.
An open source low-code machine learning library in Python. PyCaret is an alternate low-code library that can be used to replace hundreds of lines of code with only a few words. This makes experiments exponentially fast and efficient.
Machine Learning, Modeling, Open Source, PyCaret, Python
- OpenAI Open Sources Microscope and the Lucid Library to Visualize Neurons in Deep Neural Networks - Apr 17, 2020.
The new tools show the potential of data visualizations for understanding features in a neural network.
Neural Networks, Open Source, OpenAI, Visualization
- Sharing your machine learning models through a common API - Feb 12, 2020.
DEEPaaS API is a software component developed to expose machine learning models through a REST API. In this article we describe how to do it.
API, Deep Learning, Machine Learning, Open Source, Python
- Google Open Sources MobileNetV3 with New Ideas to Improve Mobile Computer Vision Models - Dec 2, 2019.
The latest release of MobileNets incorporates AutoML and other novel ideas in mobile deep learning.
Automated Machine Learning, Computer Vision, Google, Mobile, Open Source
- Open Source Projects by Google, Uber and Facebook for Data Science and AI - Nov 28, 2019.
Open source is becoming the standard for sharing and improving technology. Some of the largest organizations in the world, namely Google, Facebook and Uber, are open sourcing the technologies they use in their own workflows.
Advice, AI, Data Science, Data Scientist, Data Visualization, Deep Learning, Facebook, Google, Open Source, Python, Uber
- What’s the Best Data Strategy for Enterprises: Build, buy, partner or acquire? - Jul 22, 2019.
Every large organization is investing heavily in building data solutions and tools. They are building data solutions from scratch when they could be taking advantage of readily available tools and solutions. Many organizations are re-inventing the wheel and wasting resources.
Acquisitions, Enterprise, Implementation, Open Source, Strategy
- A comprehensive list of Machine Learning Resources: Open Courses, Textbooks, Tutorials, Cheat Sheets and more - Dec 7, 2018.
A thorough collection of useful resources covering statistics, classic machine learning, deep learning, probability, reinforcement learning, and more.
Cheat Sheet, Data Science Education, Deep Learning, Machine Learning, Mathematics, Open Source, Reinforcement Learning, Resources, Statistics
- Implementing Automated Machine Learning Systems with Open Source Tools - Oct 25, 2018.
What if you want to implement an automated machine learning pipeline of your very own, or automate particular aspects of a machine learning pipeline? Rest assured that there is no need to reinvent any wheels.
Automated Machine Learning, Feature Engineering, Feature Selection, Hyperparameter, Machine Learning, Open Source
- The 6 components of Open-Source Data Science/ Machine Learning Ecosystem; Did Python declare victory over R? - Jun 6, 2018.
We find 6 tools form the modern open source Data Science / Machine Learning ecosystem; examine whether Python declared victory over R; and review which tools are most associated with Deep Learning and Big Data.
Anaconda, Apache Spark, Data Science, Keras, Machine Learning, Open Source, Poll, Python, R, RapidMiner, Scala, scikit-learn, TensorFlow
- Torus for Docker-First Data Science - May 8, 2018.
To help data science teams adopt Docker and apply DevOps best practices to streamline machine learning delivery pipelines, we open-sourced a toolkit based on the popular cookiecutter project structure.
Data Science, DevOps, Docker, Machine Learning Engineer, Open Source, Python
- Top 16 Open Source Deep Learning Libraries and Platforms - Apr 24, 2018.
We bring to you the top 16 open source deep learning libraries and platforms. TensorFlow is out in front as the undisputed number one, with Keras and Caffe completing the top three.
Caffe, GitHub, Keras, Machine Learning, Open Source, TensorFlow
- Top 20 Python AI and Machine Learning Open Source Projects - Feb 20, 2018.
We update the top AI and Machine Learning projects in Python. TensorFlow has moved to first place with triple-digit growth in contributors. Scikit-learn dropped to 2nd place, but still has a very large base of contributors.
GitHub, Machine Learning, Open Source, Python, scikit-learn, TensorFlow
- Supercharging Visualization with Apache Arrow - Jan 5, 2018.
Interactive visualization of large datasets on the web has traditionally been impractical. Apache Arrow provides a new way to exchange and visualize data at unprecedented speed and scale.
Apache Arrow, Big Data, Data Analytics, Data Visualization, Dremio, GPU, Graphistry, Open Source
- Why Apache Arrow is the future for open source-columnar memory analytics - Aug 7, 2017.
Apache Arrow is a de facto standard for columnar in-memory analytics. In the coming years we can expect all the big data platforms to adopt Apache Arrow as their columnar in-memory layer.
Analytics, Apache, Apache Arrow, Big Data, In-Memory Computing, Open Source
- Visualizing Convolutional Neural Networks with Open-source Picasso - Aug 1, 2017.
Toolkits for standard neural network visualizations exist, along with tools for monitoring the training process, but are often tied to the deep learning framework. Could a general, easy-to-setup tool for generating standard visualizations provide a sanity check on the learning process?
Convolutional Neural Networks, Neural Networks, Open Source, Visualization
- Data Version Control: iterative machine learning - May 11, 2017.
ML modeling is an iterative process, and it is extremely important to keep track of all the steps and dependencies between code and data. A new open-source tool helps you do that.
CRISP-DM, DVC, GitHub, Machine Learning, Open Source, Reproducibility, Version Control
- Open Source Toolkits for Speech Recognition - Mar 14, 2017.
This article reviews the main options for free speech recognition toolkits that use traditional Hidden Markov Models and n-gram language models.
C++, Java, Open Source, Python, Speech Recognition, SVDS
- Top 20 Python Machine Learning Open Source Projects, updated - Nov 21, 2016.
Open source is at the heart of innovation and the rapid evolution of technologies these days. This article presents the Top 20 Python Machine Learning Open Source Projects of 2016, along with very interesting insights and trends found during the analysis.
GitHub, Machine Learning, Open Source, Python, scikit-learn
- Top Machine Learning Projects for Julia - Aug 19, 2016.
Julia is gaining traction as a legitimate alternative programming language for analytics tasks. Learn more about these 5 machine learning related projects.
Deep Learning, Julia, Machine Learning, Open Source, scikit-learn
- 35 Open Source tools for Internet of Things - Jul 25, 2016.
If you have heard about the Internet of Things many times by now, it's time to join the conversation. Explore the many open source tools & projects related to the Internet of Things.
Internet of Things, IoT, Open Source, Tools
- 5 Machine Learning Projects You Can No Longer Overlook - May 19, 2016.
We all know the big machine learning projects out there: Scikit-learn, TensorFlow, Theano, etc. But what about the smaller niche projects that are actively developed, providing useful services to users? Here are 5 such projects.
Data Cleaning, Deep Learning, Machine Learning, Open Source, Overlook, Pandas, Python, scikit-learn, Theano
- Top 10 Data Science Resources on Github - Mar 24, 2016.
The top 10 data science projects on Github are chiefly composed of a number of tutorials and educational resources for learning and doing data science. Have a look at the resources others are using and learning from.
Coursera, GitHub, IPython, Johns Hopkins, Open Source, Top 10
- Top 10 Data Visualization Projects on Github - Feb 22, 2016.
Github provides a number of open source data visualization options for data scientists and application developers integrating quality visuals. This is a list and description of the top project offerings available, based on the number of stars.
D3.js, Data Visualization, GitHub, Matthew Mayo, Open Source, Top 10
- Opening Up Deep Learning For Everyone - Feb 19, 2016.
Opening deep learning up to everyone is a noble goal. But is it achievable? Should non-programmers and even non-technical people be able to implement deep neural models?
Caffe, Deep Learning, Feature Engineering, Open Source, TensorFlow
- Auto-Scaling scikit-learn with Spark - Feb 11, 2016.
Databricks gives us an overview of the spark-sklearn library, which automatically and seamlessly distributes model tuning on a Spark cluster, without impacting workflow.
Apache Spark, Databricks, Open Source, scikit-learn
- Top 10 Deep Learning Projects on Github - Jan 13, 2016.
The top 10 deep learning projects on Github include a number of libraries, frameworks, and education resources. Have a look at the tools others are using, and the resources they are learning from.
Caffe, Deep Learning, GitHub, Open Source, Top 10, Tutorials
- Top 10 Machine Learning Projects on Github - Dec 14, 2015.
The top 10 machine learning projects on Github include a number of libraries, frameworks, and education resources. Have a look at the tools others are using, and the resources they are learning from.
GitHub, Machine Learning, Matthew Mayo, Open Source, scikit-learn, Top 10
- Topological Data Analysis – Open Source Implementations - Nov 6, 2015.
Topological Data Analysis (TDA) is making waves in the analytics community lately, but are there open source options available?
C++, Java, Matthew Mayo, Open Source, Python, R, Topological Data Analysis
- Interview: Joseph Babcock, Netflix on Genie, Lipstick, and Other In-house Developed Tools - Jun 16, 2015.
We discuss the role of analytics in content acquisition, data architecture at Netflix, organizational structure, and open-source tools from Netflix.
Data Science, ETL, In-house, Interview, Joseph Babcock, Netflix, Open Source, Tools
- Top 20 Python Machine Learning Open Source Projects - Jun 1, 2015.
We examine the top Python machine learning open source projects on Github, both in terms of contributors and commits, and identify the most popular and most active ones.
GitHub, Machine Learning, Open Source, Python, scikit-learn
- SlamData Open Source Analytics Tool for MongoDB - Dec 4, 2014.
SlamData is an open source SQL-based tool designed to make accessing data in MongoDB easy for developers and non-developers alike with the goal of making application intelligence easier.
MongoDB, NoSQL, Open Source, SlamData, SQL
- Rattle package for Data Mining and Data Science in R - Sep 17, 2014.
Try the newly-released version of Rattle, the open source R package for data mining, and enjoy accessing a huge array of data mining algorithms through a convenient interface.
Data Mining Software, Free Software, Graham Williams, Open Source, R, Togaware
- Interview: Michael Berthold, President and Founder of KNIME, on Data Mining, Startups, and Visual Workflow - Aug 9, 2014.
We discuss KNIME's key features and how it compares to the competition, the KNIME business model, Pharma, planned development, and the transition from an academic project to a company.
Knime, Konstanz University, Michael Berthold, Open Source
- DLib: Library for Machine Learning - Jun 10, 2014.
DLib is an open source C++ library implementing a variety of machine learning algorithms, including classification, regression, clustering, data transformation, and structured prediction.
C++, DLib, Machine Learning, Open Source, Tools
- OpenNN, An Open Source Library For Neural Networks - Jun 2, 2014.
OpenNN is an open source class library written in C++ which implements neural networks, and runs on Windows, Apple, or Linux.
Neural Networks, Open Source, OpenNN
- Big Data Landscape, v 3.0, analyzed - May 15, 2014.
We analyze the Big Data Landscape and identify the most popular market segments in Analytics, Infrastructure, Applications, Open Source, and Data Sources categories. It is still early - only 4.5% of companies had exits.
Big Data, Big Data Analytics, Data Platform, Infrastructure, Landscape, Open Source, Startups
- Open Source Data Science Masters Curriculum - Dec 21, 2013.
A good collection of open source resources for Data Science Masters Curriculum, covering Math, Algorithms, Databases, Data Mining, Machine Learning, Natural Language Processing, Data Analysis and Visualization, and Python.
MS in Data Science, Open Source