Called the sexiest job of this century by the Harvard Business Review, the data scientist profession remains as hot as it is rare in Brazil. Hot because demand continues to expand even as the Brazilian labour market contracts. “Crisis? What crisis?” a qualified data scientist is likely to say.
Rare because it is not easy to find professionals who meet the triad the profession requires: knowledge of programming, statistics/mathematics and business acumen.
“It is a new career in Brazil, and we are seeing an exponential increase in the demand for professionals. On the other hand, we still don’t have many people being trained in this area”, says Henrique Gamba, general director of Yoctoo, a recruitment firm specializing in IT.
According to him, the crisis has not reduced salaries for these professionals: they range from R$9,000 for those with 3 to 4 years of experience up to R$22,000 for a specialist in the field. “It is because the generation, capture and storage of data, in an ever-increasing volume (big data), is the key to the direction and strategy of any business”, says Gamba.
The companies that hire the most data scientists are technology solution providers (big data) and those that work intensively with data, such as financial institutions, research institutes, internet companies, e-commerce businesses and credit bureaus.
The different phases
“The heart of a data scientist’s activity is to analyze a mass of information to make inferences”, says Lucas de Paula of Neoway, a business intelligence solutions company.
Data is the raw material of the scientist. But the scientist’s first challenge is to frame the right question: the one whose answer the business actually needs.
It is from this question that the work of capturing and preparing the data begins. With the terrain prepared, it’s time to apply the mathematics: statistical models are built, and hypotheses are created and tested.
“He uses the raw material (data) to extract insights, generate these hypotheses that will be put to the test with statistical and mathematical work”, says Lucas. The scientific validation of these inferences is what will, in fact, deliver a valuable answer to the business. The next step is to present the findings to external and internal customers. “Knowing how to communicate is equally important. It is often necessary to convince the customer that the model is accurate,” he says.
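The preparation and hypothesis-testing steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method attributed to Neoway or anyone quoted here: the records, the group names (`promo`, `control`) and the helper functions `prepare` and `permutation_test` are all hypothetical, and the permutation test is just one simple way of checking whether two groups differ.

```python
import random
import statistics


def prepare(raw_rows):
    """Preparation step: keep only rows with a group label and a numeric value."""
    cleaned = []
    for group, value in raw_rows:
        try:
            name = group.strip().lower()
            number = float(value)
        except (AttributeError, TypeError, ValueError):
            continue  # drop malformed records (missing label, unparseable value)
        if name:
            cleaned.append((name, number))
    return cleaned


def permutation_test(a, b, n_resamples=2000, seed=0):
    """Hypothesis test: two-sided permutation test on the difference of means.

    Returns an estimated p-value for the hypothesis that both groups
    come from the same distribution.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(left) - statistics.mean(right)) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


# Hypothetical raw data: (group, purchase value) pairs as they might be captured.
raw = [
    (" Promo", "120.5"), ("control", "98"), ("promo", "n/a"),
    (None, "101"), ("Control ", 95.0), ("promo", "131"),
    ("promo", "118"), ("control", "102.5"), ("control", "99"),
]

rows = prepare(raw)
promo = [v for g, v in rows if g == "promo"]
control = [v for g, v in rows if g == "control"]
print("clean rows:", len(rows))
print("estimated p-value:", permutation_test(promo, control))
```

A small p-value would suggest the difference between the groups is unlikely to be due to chance alone; whether that difference matters is the business question the analysis started from.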
Analyzing the data and drawing conclusions is like combining notes on a piano to produce a song, Lucas says. But to play this piano, you must first set it up and get it ready for use, which, in the language of data science, means preparing the collected information.
“The difference between a statistician and a data scientist is that the former needs the data ready to work with, and the latter has the versatility to do this preparation; he has the so-called hacking skills, which the statistician does not have.”
Combination of skills
Because it is extremely difficult to combine all these skills, the most common thing is to form teams with professionals who complement each other.
“We mix and build teams. A computing professional is in charge of the preparation and works together with whoever does the mathematical analysis, and we also embed business knowledge”, says Monica Tyszler, director of solutions and services at SAS Latin America.
The lack of any one of these skills compromises the success of the whole job. “If there is expertise in computing and mathematics, but none in business, it will not be possible to find out what problem the company needs to solve,” says De Paula.
Expertise in computing and business but none in mathematics and statistics will undermine the rigour of the analysis. And finally, a lack of computing knowledge prevents the extraction of a significant volume of data, which is what ultimately brings value to the business.
At Neoway, De Paula has people work in pairs to ensure all skills are on the table. “For example, here we have a physicist and an econometrics specialist who work together,” he says.
The most frequent training
In the United States, some universities already have highly regarded training programs. But because it is a new activity, most of the professionals who work today in the area come from other academic backgrounds.
According to Yoctoo’s general director, degrees in statistics, engineering, mathematics, and physics appear frequently on these professionals’ CVs. “It is important to note that courses such as master’s and doctoral degrees are almost always mandatory”, says Gamba. According to him, the careers that make it easier to migrate to the area are business intelligence, statistics, and technology.
In the opinion of the director of SAS, training in production engineering linked to computer engineering provides a good foundation for anyone who wants to pursue a career in data science. “There are programming classes, statistical mathematics, research and there is business knowledge applied to production engineering”, she explains.
Knowing how to program and having knowledge of database management systems is essential. Most of the languages used in this field are open source, that is, their source code is freely available. At Neoway, teams work with Python, R, Scala, Go and Spark; with the data stores MongoDB, SQL databases, Elasticsearch, Neo4j and Cassandra; and with the Apache Kafka messaging system.
SAS is one of the companies that invest in the qualification of professionals. “We started with a data scientist training course in the United States and the idea was to bring it to Brazil”, says Monica. Big data management techniques, advanced analytics, data visualization, machine learning and communication techniques are also essential for data scientists; the SAS Academy for Data Science is an option for anyone looking for certification in this area.