Transfix Tech Talk (Machine Learning)
From Big Data and Machine Learning to Blockchain and AI, the language of technology is quickly integrating itself into the language of business. Have you ever wondered exactly what these words and phrases mean and the role they play in your daily life?
In our ongoing Tech Talk series, we’ll be leaning on the expertise of our CTO, Jonathan Salama, to decode the most prevalent tech terms of the day.
What is Machine Learning?
Machine learning is a form of AI that enables a system to learn from data rather than through explicit programming. Essentially, the process starts by providing a machine learning algorithm with a data set. Once the algorithm begins to identify variables that are mathematically significant, it creates a new, refined algorithm. This takes place every time we price a shipment, so each shipment refines and strengthens the accuracy of our algorithm.
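To make the idea of "learning from data" concrete, here is a minimal sketch in Python. It is not Transfix's pricing algorithm: the `RateModel` class, the single `miles` feature, and the sample loads are all hypothetical, chosen only to show a model whose parameters come from the data and are re-estimated each time a new load arrives.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RateModel:
    """Hypothetical model: price = intercept + slope * miles (ordinary least squares)."""
    miles: List[float] = field(default_factory=list)
    prices: List[float] = field(default_factory=list)
    intercept: float = 0.0
    slope: float = 0.0

    def add_load(self, miles: float, price: float) -> None:
        """Record one priced load, then re-estimate the parameters from all data so far."""
        self.miles.append(miles)
        self.prices.append(price)
        self._refit()

    def _refit(self) -> None:
        n = len(self.miles)
        if n < 2:
            return  # a line cannot be fitted to fewer than two points
        mean_x = sum(self.miles) / n
        mean_y = sum(self.prices) / n
        var_x = sum((x - mean_x) ** 2 for x in self.miles)
        if var_x == 0:
            return  # all loads have the same distance; slope is undefined
        cov = sum((x - mean_x) * (y - mean_y)
                  for x, y in zip(self.miles, self.prices))
        self.slope = cov / var_x
        self.intercept = mean_y - self.slope * mean_x

    def predict(self, miles: float) -> float:
        """Estimate a price for a load of the given distance."""
        return self.intercept + self.slope * miles


# Illustrative, made-up loads: (miles, price in dollars)
model = RateModel()
for m, p in [(100, 400), (200, 600), (300, 800)]:
    model.add_load(m, p)

print(model.predict(250))
```

Nobody wrote the rule "charge a base amount plus so much per mile" into the code; the intercept and slope are estimated from the loads themselves, and they shift whenever a new load is added. A production system would use many more variables and a more capable model, but the principle is the same.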
What role do members of our data team play in machine learning here at Transfix?
For starters, you have to decide which data sets you’re going to feed your algorithm. You can’t simply throw a mountain of data at a machine and expect it to just figure it out. Being selective in the data you choose and how you massage that data is critical, because we’re not in an industry where reliable data is readily available. We’re far from the Facebooks and Googles of the world. They have access to an ocean of third-party data that they can rely upon on a daily basis. This places an added emphasis on our in-house data, which we believe is the most accurate and actionable in the industry.
The challenge we face is that with a lower volume of outside data, the human element becomes more important. Which data we choose, how we aggregate the information, and which models we choose to use - these are critical decisions made by members of our data team. To make informed decisions, you need a comprehensive understanding of advanced mathematics and statistical modeling. That is one of our strengths - the expertise of our data team.
Have you ever wondered what data sets play a part in Transfix’s best-in-class algorithm? Filip Piasevoli, a Senior Data Scientist, is here to walk you through the DNA of our machine learning-enabled algorithm.
Why are pricing and forecasting such challenging endeavors?
A hallmark of our industry is that there isn’t a ton of monthly rate data available. This is pretty different from what you see in industries like hedge funds, which have a well-defined market of third-party data providers. We use DAT, ITS, and Sonar as a few sources of information. But it’s important to keep in mind that a source like Sonar is a relatively new product. I think we’re just starting to see the data provider market mature in step with venture capital investment in the space as a whole.
Where Transfix differentiates itself is that we supplement third-party data through dozens of unique channels. For example, we mine data from government indexes such as the CFS (Commodity Flow Survey). Poring over a variety of government indexes on a monthly basis affords us a granular understanding of the marketplace. As a whole, these data sets guide our opinions of where the market is going.
In addition, we supplement all the third-party data we collect by leaning heavily on our own proprietary data: historical rates in our system, call rates, seasonality, and so on. With each load, our algorithm improves, which leads to better accuracy.
If you have questions for our Tech Team, we’d be happy to answer them in upcoming posts! Tweet at us with the hashtag #TalkToTech.