About Us

Data Works MD consists of professionals, students, and enthusiasts who live and work in the Maryland area and are interested in data science, data analytics, data products, software engineering, machine learning, and other data engineering topics.


Register for one of our upcoming events!


Recent videos of our events can be found below. More are available on YouTube.


February 24, 2021

ML Design Patterns and Designing ML Infrastructure

Design patterns are formalized best practices to solve common problems when designing a software system. As machine learning moves from being a research discipline to a software one, it is useful to catalog tried-and-proven methods to help engineers tackle frequently occurring problems that crop up during the ML process. In this talk, I will cover five patterns (Workflow Pipelines, Transform, Multimodal Input, Feature Store, Cascade) that are useful in the context of adding flexibility, resilience and reproducibility to ML in production. For data scientists and ML engineers, these patterns provide a way to apply hard-won knowledge from hundreds of ML experts to your own projects.


January 16, 2021

Malware Detection, Enabled by Machine Learning

With the volume of new malware created each year growing, along with the expanding market opportunities for malware reuse, protecting systems can’t rely solely on downloading a vendor’s updated virus signature files. Our customers need ways to detect and cordon likely threats by using data retrieved from a combination of static and behavioral characteristics and comparing it to other classes of “good” versus “bad” files. Optimally, the solution cordons risky files, force ranks them according to their likelihood of causing harm, correlates some metadata to help with further learning and to provide context to analysts, and lets an analyst “release” a file after further analysis and a request from a user. That feedback is then relayed back into the model to support further tuning.
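
The cordon, rank, and release loop described above can be sketched roughly as follows. This is a minimal illustration, not the speakers' system: the `ScannedFile` class, the `risk_score` field, the 0.5 threshold, and the `triage` and `release` helpers are all hypothetical names and values chosen for the example, and the score is assumed to come from some already-trained classifier.

```python
from dataclasses import dataclass


@dataclass
class ScannedFile:
    name: str
    risk_score: float          # hypothetical model output in [0, 1]
    quarantined: bool = False


def triage(files, threshold=0.5):
    """Cordon files scoring at or above the threshold and rank them, riskiest first."""
    risky = [f for f in files if f.risk_score >= threshold]
    for f in risky:
        f.quarantined = True
    return sorted(risky, key=lambda f: f.risk_score, reverse=True)


def release(scanned_file, feedback_log):
    """Analyst releases a file; the decision is logged as a label for later retraining."""
    scanned_file.quarantined = False
    feedback_log.append((scanned_file.name, "benign"))


files = [
    ScannedFile("a.exe", 0.9),
    ScannedFile("b.doc", 0.2),
    ScannedFile("c.dll", 0.6),
]
ranked = triage(files)
feedback_log = []
release(ranked[1], feedback_log)   # analyst clears the lower-ranked file
```

The feedback log here is only a list of labels; in practice it would presumably feed whatever retraining process the model uses.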


November 7, 2020

Edge Device Computing for Machine and Deep Learning

Edge computing is a distributed computing model in which computing takes place near the physical location where data is being collected and analyzed, rather than on a centralized server or in the cloud. According to Gartner, "91% of today’s data is created and processed in centralized data centers. By 2022 about 75% of all data will need analysis and action at the edge."


October 6, 2020

Wrangling Data and Visualizing Patterns with Python and GIS

A geographic information system (GIS) is a framework for gathering, managing, and analyzing data. Rooted in the science of geography, GIS integrates many types of data. GIS data is used for a variety of purposes including mapping, urban planning, agriculture, and banking. Join us in October to learn how you can use Python to explore, analyze, and work with GIS data.


September 8, 2020

The Business of Data in Maryland with Yet Analytics and Protenus

In partnership with TEDCO, we are featuring two speakers from the successful Maryland-based data-focused companies, Yet Analytics and Protenus. Protenus will be discussing how they built their Protenus Healthcare Compliance Analytics platform with a discussion on Random Forests. Yet Analytics will discuss the unique approach of their xAPI as a specification based in the world of semantic technology and talk about how xAPI is implemented for data simulation, analytics, and advanced visualization and reporting in the learning and training space.


August 15, 2020

Pitfalls and Challenges of ML-Powered Applications

As a field, we often hear about success stories. This is true in research, where a publishing incentive can pressure authors to focus on consistently exceeding state-of-the-art results. It is also true in industry, where companies attempt to attract engineering talent by describing how impressive their production ML systems are. However, every practitioner knows that in engineering and in ML, the road to success is paved with failures. The field of ML in production is new, so it lacks cautionary tales about what can go wrong with models.


Interesting articles, tools, and tutorials. More are available at our newsletter archive.

Data Works MD February 2021 Issue

Data-driven company, data leakage, sentiment analysis, and top Python libraries, ...

Data Works MD December 2020 Issue

AlphaFold, Netflix, 2020 trends, awesome data engineering, ...

Data Works MD November 2020 Issue

State of AI in 2020, evil data science, data orchestration, and how to win Kaggle competitions, ...


We are proudly supported by the following organizations.