Popular Data Analysis Tools: Introduction
- Vicky Costa
- Apr 12, 2024
- 2 min read
As the amount of data generated continues to grow exponentially, using appropriate tools to analyse it is fundamental to extracting valuable insights and making informed decisions. In this report, I share some of my experience with the main tools used and requested in data analysis, including Python, Power BI, Excel, R and SQL.
Python (with Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn)
I used Python for data analysis in a sales project, with Pandas to clean and manipulate a large set of transactional data. I used NumPy for numerical calculations, and Matplotlib and Seaborn to create clear and informative visualisations that helped identify sales trends over time. Although I didn't use it in that project, Scikit-learn could also be used to develop a machine learning model that predicts customer buying behaviour.
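As an illustration of the cleaning and aggregation step described above, here is a minimal sketch. The data, the column names (`order_date`, `product`, `quantity`, `unit_price`) and the helper `monthly_revenue` are all hypothetical; they are not taken from the actual project.

```python
import numpy as np
import pandas as pd

# Hypothetical transactional data; column names and values are illustrative only.
raw = pd.DataFrame(
    {
        "order_date": ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-03", None],
        "product": ["shirt", "shoes", "shirt", "shirt", "hat"],
        "quantity": [2, 1, 3, 3, 1],
        "unit_price": [20.0, 55.0, 20.0, 20.0, np.nan],
    }
)


def monthly_revenue(df: pd.DataFrame) -> pd.Series:
    """Clean the transactions and return total revenue per calendar month."""
    clean = (
        df.drop_duplicates()  # remove exact duplicate rows
        .dropna(subset=["order_date", "unit_price"])  # drop incomplete records
        .assign(order_date=lambda d: pd.to_datetime(d["order_date"]))
    )
    clean["revenue"] = clean["quantity"] * clean["unit_price"]
    # Group by calendar month and sum the revenue.
    return clean.groupby(clean["order_date"].dt.to_period("M"))["revenue"].sum()


revenue = monthly_revenue(raw)
print(revenue)  # January: 95.0, February: 60.0
```

A series like `revenue` is the kind of input one could then pass to Matplotlib or Seaborn to plot a sales trend over time.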

Power BI or Tableau
I have previous experience with Power BI. I've only used Tableau while studying, but both tools are used to create data visualisations. For example, in a past project, I used Power BI to connect to diverse data sources, such as SQL databases and CSV files. I developed interactive dashboards that allowed users to explore sales data and product performance across different geographical regions. These visualisations made it easier to identify opportunities for optimising sales and allocating resources.

Excel
Although it's not exclusively a data analysis tool, Excel is still a widely used tool in this context. In a previous project, I worked with a large set of marketing data and used pivot tables to summarise and analyse key metrics from advertising campaigns. In addition, I created dynamic graphs and charts to visualise campaign performance over time and identify seasonal patterns.
However, for very large data sets or complex analyses, Excel can present limitations compared to other tools.
R (with RStudio, dplyr, tidyverse, ggplot2)
Although I don't yet have extensive experience with R, I'm looking forward to exploring its functionalities, especially after working extensively with Python. RStudio, along with the tidyverse collection of packages (which includes dplyr and ggplot2), offers a robust environment for statistical analysis and data visualisation. I would like to take advantage of the flexibility and power of R to conduct more detailed exploratory analyses and create complex graphical visualisations, similar to what I do with Python.
SQL (with MySQL, PostgreSQL, Microsoft SQL Server)
My experience with SQL has been in different sectors, including fashion, banking and consultancy. In a previous project, I worked with MySQL to extract and analyse customer transaction data in an online shop, allowing us to identify purchasing patterns and calculate sales performance metrics. The ability to write complex queries has allowed me to meet the data analysis needs of various companies, including renowned private sector organisations.
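To give an idea of the kind of query involved, here is a small sketch of a per-customer purchasing summary. It runs against an in-memory SQLite database through Python's standard `sqlite3` module purely so that it is self-contained; the project itself used MySQL, and the `transactions` table, its columns and the data are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the real MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer_id INTEGER, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, "2024-01-05", 40.0),
        (1, "2024-02-10", 60.0),
        (2, "2024-01-20", 55.0),
        (3, "2024-02-03", 20.0),
        (3, "2024-02-15", 30.0),
        (3, "2024-03-01", 10.0),
    ],
)

# Orders, total spend and average order value per customer.
query = """
SELECT customer_id,
       COUNT(*)    AS orders,
       SUM(amount) AS total_spent,
       AVG(amount) AS avg_order_value
FROM transactions
GROUP BY customer_id
ORDER BY total_spent DESC, customer_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

The same `GROUP BY` pattern should work with little or no change in MySQL, PostgreSQL and SQL Server, since it uses only standard aggregate functions.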
SAS
Although I don't have direct experience with SAS, I recognise its importance in the data analysis landscape. Like other tools such as R and Python, SAS is widely used in corporate and academic environments for statistical analysis, predictive modelling and data mining. It offers a variety of features for dealing with large data sets and conducting advanced analysis, making it a popular choice for organisations that need robust data analysis solutions.
Apache Spark
Although I haven't worked directly with Apache Spark, I recognise the importance of this tool in large-scale data analysis and distributed processing. Apache Spark is a distributed computing framework designed to process large volumes of data in parallel across a cluster. It supports big data analysis, machine learning at scale and near-real-time stream processing, which makes it a popular choice for organisations that deal with massive data sets and need fast analysis.
Research Insights
While applying for 57 data analysis vacancies, I used a variety of channels. Most applications were made through LinkedIn, either via posted vacancies or via direct messages. I also used the Gupy platform on 3 occasions and was contacted by telephone on 2 occasions. One application was made via the company's website.
Featured technologies: Based on the vacancies for which I applied, the five technologies most requested by employers were:
Power BI or Tableau (43 vacancies)
SQL (42 vacancies)
Python (27 vacancies)
Excel (21 vacancies)
PowerPoint (13 vacancies)
These were the technologies most sought after by employers in the data analysis vacancies I applied for.
Emphasis on languages required:
I observed a strong demand for professionals with English skills; it was the most common language requirement among the data analysis vacancies.
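To put the counts above in proportion, this short sketch converts them into shares of the 57 vacancies. The counts come from the list above; the variable names are just illustrative.

```python
TOTAL_VACANCIES = 57

# Number of vacancies that requested each technology (from the list above).
mentions = {
    "Power BI or Tableau": 43,
    "SQL": 42,
    "Python": 27,
    "Excel": 21,
    "PowerPoint": 13,
}

# Share of all vacancies, as a percentage rounded to one decimal place.
shares = {tool: round(100 * count / TOTAL_VACANCIES, 1) for tool, count in mentions.items()}

for tool, share in shares.items():
    print(f"{tool}: {share}%")
```

In other words, roughly three out of four vacancies asked for a BI tool or SQL, and just under half asked for Python.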
Location preferences:
With regard to location, I chose to apply for vacancies in Barcelona, in Spain more broadly and in Brazil. However, it's worth noting that the opportunities in Brazil and the rest of Spain were for remote work, while in Barcelona there was the option of on-site, remote or hybrid work.
Desired levels of experience:
As for the level of experience desired by companies, I observed a significant demand for mid-level professionals, followed by juniors and seniors. It's interesting to note that of the 57 vacancies I applied for, only 6 (about 11%) specified salary information in the job description. This suggests that companies commonly avoid stating salaries explicitly in job listings.
Conclusion
I hope this report on the main data analysis tools has been useful and enlightening. As we explore Python, Power BI, Excel, R, SQL, SAS and Apache Spark, it becomes clear how important it is to adapt our skills to the demands of the ever-evolving labour market. Data analysis plays a crucial role in many industries, and mastering these tools can open doors to exciting professional opportunities.
Always remember to keep learning and exploring new techniques, as the journey in data analysis is a continuous search for valuable insights and informed decisions.