be achieved with the help of Social Network Analysis (SNA).
2.5 Scope of Python in SNA
Python has become one of the most in-demand and most widely sought-after programming languages in recent years, and its community is growing quickly. In past decades, social networks were analyzed using tools such as C. The scope and applications of social networks are increasing drastically [9]. It is not practical to expect domain experts from different fields to also become expert programmers in order to implement their ideas for network analysis; learning Python as a tool to support those ideas, however, is realistic [10].
This section introduces the main syntax and coding style of Python, as well as the relevant library packages and their significance [11].
2.5.1 Comparison of Python With Traditional Tools
1 Python is free and open source, whereas MS Excel is a paid package.
2 Python handles complex equations and large data sets well, whereas Excel is suited only to small data sets.
3 Because Python is open source, anyone can audit or replicate an analysis, which is not possible in Excel.
4 Finding and fixing errors is much easier in Python than in Excel.
5 Excel is much simpler to use than Python, i.e., the user does not need any programming knowledge.
6 Repetitive tasks can easily be automated in Python, which is not possible in Excel.
7 Python provides in-depth visualizations, whereas Excel offers only basic graphs [12].
2.6 Installation
Installing Python takes a little time, because it must be set up in the right environment with all the necessary packages [13]. The standard version of Python can be installed from the following link [https://www.python.org/downloads/].
Versions of Python for each type of OS (Windows, Mac, Linux) can be found under this link.
Some important packages for SNA are pandas, matplotlib, and NetworkX. All of these packages can be installed with pip.
– pip install pandas
– pip install networkx
– pip install matplotlib
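After installation, it can be useful to confirm that the packages are actually importable. The following is a minimal sketch that uses only the standard library; the helper name check_packages is hypothetical and is not part of pip or of any of the three packages.

```python
from importlib import metadata, util


def check_packages(names):
    """Map each import name to its installed version, or None if it is missing."""
    report = {}
    for name in names:
        if util.find_spec(name) is None:
            report[name] = None  # the package cannot be imported
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata under this name
            report[name] = "unknown"
    return report


if __name__ == "__main__":
    for pkg, version in check_packages(["pandas", "networkx", "matplotlib"]).items():
        print(f"{pkg}: {version or 'not installed'}")
```

A package reported as "not installed" can be installed with the corresponding pip command from the list above.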
NetworkX is an important library for analyzing social networks in Python [14]. The package was created mainly to analyze the structure and functions of complex graphs. It is a free package released under the BSD license.
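To give a first impression of the library, the following minimal sketch builds a small undirected friendship graph and queries it. The names and edges are made up for illustration and do not come from the case-study data.

```python
import networkx as nx

# Build a small undirected friendship graph (illustrative names only).
G = nx.Graph()
G.add_edges_from(
    [("Ana", "Ben"), ("Ana", "Cara"), ("Ben", "Cara"), ("Cara", "Dev")]
)

print(G.number_of_nodes(), G.number_of_edges())  # 4 4
print(sorted(G.neighbors("Cara")))               # ['Ana', 'Ben', 'Dev']
print(nx.density(G))                             # share of possible edges that exist
```

Nodes are created automatically when an edge that mentions them is added, so no separate node list is needed here.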
Figure 2.7 Python official documentation.
2.6.1 Good Practices
1 It is always advisable to work in a virtual environment, such as an Anaconda environment. Miniconda can be used instead of Anaconda if the computer has less than 5 GB of RAM [15]. The standard version of Anaconda can be downloaded here [https://docs.anaconda.com/anaconda/install/].
2 Choose an editor, such as VS Code, PyCharm, IntelliJ, or Jupyter Notebook; these can be used along with the Anaconda environment.
3 Proceed with the open-source version at the beginning. Use either Anaconda Navigator (interactive visual mode) or the Prompt (terminal mode):
– Creating a new environment in Anaconda: conda create --name myenv
– Replace myenv with the environment name.
– Activating the environment: conda activate myenv
– Installing packages: conda install [packagename]
More resources and explanations on working with conda environments can be found in the official conda documentation.
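Once an environment has been activated, it can help to confirm from inside Python which interpreter is actually running. The sketch below uses only the standard library; describe_environment is a hypothetical helper name, and the check relies on the CONDA_DEFAULT_ENV variable that conda sets on activation.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter that is currently running."""
    return {
        "python_version": sys.version.split()[0],
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # Set by conda when an environment is activated; None outside conda.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If conda_env does not show the expected name (for example, myenv), the environment is probably not activated in the current terminal or notebook kernel.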
Figure 2.8 Anaconda navigator.
Figure 2.9 Conda environment installation.
2.7 Use Case
Some interesting case studies based on SNA are Facebook friend groups and terrorist activities [16]. The case studies have been worked through in Python with Jupyter Notebook. You can download and explore the data sets to gain more insight under the following link.
Scan the QR code and follow the GitHub link to access the worksheets.
Figure 2.10 QR code for workbooks and source codes.
2.7.1 Facebook Case Study
The first important step in analyzing any kind of data set in Python is importing libraries. The data to be analyzed can be scraped directly from the respective site, or it can be accessed through the API provided by the website [17]. The choice of data depends mainly on the need: Why do we need to analyze the data? What is the purpose? What kind of problem are we solving? [18]
Step 1: Import libraries
Each library provides its own built-in functions, which makes Python easy to code.
Figure 2.11 Code blocks for importing libraries.
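The original code for this step is available only as a figure. As a rough sketch, and assuming the three packages named in Section 2.6 under their conventional aliases, the imports might look like this:

```python
import matplotlib.pyplot as plt  # plotting
import networkx as nx            # graph construction and analysis
import pandas as pd              # tabular data handling

# Printing the versions is a quick way to confirm the imports succeeded.
print("pandas", pd.__version__)
print("networkx", nx.__version__)
```

The aliases plt, nx, and pd are community conventions rather than requirements.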
Step 2: Read data
Pandas is used to load the data and makes it convenient to explore a large data set.
Figure 2.12 Code block for reading data.
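As an illustration of this step, the sketch below loads a tiny table with pandas. The in-memory text stands in for a downloaded file, and the column names source and target are assumptions made for the example, not the layout of the case-study data.

```python
import io

import pandas as pd

# Stand-in for a downloaded file: each row is one friendship (edge).
raw = io.StringIO("source,target\n0,1\n0,2\n1,2\n2,3\n")

df = pd.read_csv(raw)

print(df.shape)   # (4, 2)
print(df.head())  # first rows, for a quick look at the data
```

With a real file, the StringIO object would be replaced by the path to the downloaded data set.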
Step 3: Data cleaning
Data cleaning means removing or correcting noise such as NaN and missing values [19]. Data quality has a strong impact on the model, so using data with less noise is recommended for better results. Missing values can be replaced with the mean, the median, and so on [20–22]. The appropriate choice depends entirely on the type of data.
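The two common options, dropping incomplete rows or imputing missing values, can be sketched with pandas as follows. The table and its column names are invented for illustration, and the choice of mean imputation for the numeric column is only one possibility.

```python
import numpy as np
import pandas as pd

# Toy table with one missing number and one missing label.
df = pd.DataFrame(
    {
        "user": ["a", "b", "c", "d"],
        "age": [20.0, np.nan, 30.0, 40.0],
        "city": ["Oslo", "Pune", None, "Lima"],
    }
)

print(df.isna().sum())  # number of missing values per column

# Option 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Option 2: impute instead of dropping.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].mean())  # numeric: mean
filled["city"] = filled["city"].fillna("unknown")           # text: placeholder
```

Dropping rows is the simplest approach but discards data, whereas imputation keeps every row at the cost of introducing estimated values.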
Step 4: Read input
read_edgelist is a built-in function of the NetworkX library. More details can be found on the documentation website [23].
Figure 2.13 Code block for reading edge list.
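A minimal sketch of this step is shown below. It writes a tiny edge list to a temporary file so that the example is self-contained; the file contents are made up, and in practice the path would point to the downloaded data set.

```python
import os
import tempfile

import networkx as nx

# Each line holds one edge as two whitespace-separated node ids.
edge_text = "0 1\n0 2\n1 2\n2 3\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(edge_text)
    path = handle.name

# nodetype=int converts the node labels from strings to integers.
G = nx.read_edgelist(path, create_using=nx.Graph(), nodetype=int)
os.remove(path)  # clean up the temporary file

print(G.number_of_nodes(), G.number_of_edges())
print(sorted(G.edges()))
```

Passing nx.DiGraph() as create_using would produce a directed graph instead, which suits networks where relationships are not mutual.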
Step 5: Visualizing the network
Figure 2.14 Visualization of Facebook users.
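A drawing of this kind can be produced with NetworkX and matplotlib. The sketch below is an assumption about how such a plot might be made: it uses the small karate club graph that ships with NetworkX as a stand-in for the Facebook data, and the layout, node size, and output file name are arbitrary choices.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line in Jupyter
import matplotlib.pyplot as plt
import networkx as nx

G = nx.karate_club_graph()  # built-in sample network used as a stand-in

pos = nx.spring_layout(G, seed=42)  # fixed seed gives a reproducible layout
fig, ax = plt.subplots(figsize=(8, 6))
nx.draw_networkx(
    G, pos=pos, ax=ax, node_size=80, with_labels=False, edge_color="gray"
)
ax.set_axis_off()

out_path = os.path.join(tempfile.gettempdir(), "network_sketch.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)
print("saved to", out_path)
```

For large networks, smaller nodes and hidden labels, as used here, usually keep the picture readable.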
Step 6: Centrality measures
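As a small illustration of what centrality measures report, the sketch below computes three common ones with NetworkX on a four-node star graph. The choice of degree, betweenness, and closeness centrality is an assumption made for this example, and the star graph is a toy stand-in for the Facebook data.

```python
import networkx as nx

G = nx.star_graph(3)  # node 0 is the hub; nodes 1, 2, and 3 are leaves

degree = nx.degree_centrality(G)            # share of other nodes each node touches
betweenness = nx.betweenness_centrality(G)  # share of shortest paths through a node
closeness = nx.closeness_centrality(G)      # inverse of the average distance to others

for node in G.nodes:
    print(
        node,
        round(degree[node], 2),
        round(betweenness[node], 2),
        round(closeness[node], 2),
    )

most_central = max(degree, key=degree.get)
print("most central node by degree:", most_central)
```

In a star graph the hub scores highest on all three measures, which makes it a convenient sanity check before moving to real data.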
Figure