source_code package
Submodules
source_code.classes module
- class source_code.classes.research_topic(name, reference_paper_eid, keywords)
Bases: object
This class represents a user-defined research topic and provides the methods for extracting the relevant papers from Scopus.
- analyze()
This function performs the principal analysis described in the documentation.
INPUT: self: an empty research topic object, created earlier in __init__
OUTPUT: self: a filled research topic object
Output files: - Figure.html: an interactive network graph representing the paper population - Topic_name_outputs.xlsx: a list of papers corresponding to the research topic - graph_df.xlsx: the network graph in Excel format.
Note that one of the columns in graph_df is named ‘Direction’ and takes the value 1 or 2. A value of 1 means that the paper from the primary list cites the paper from the secondary list. A value of 2 means that the paper from the secondary list cites the paper from the primary list.
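To illustrate the Direction convention, the following sketch converts rows of a graph_df-like table into (citing, cited) pairs. Only the Direction column is named in the documentation above; the column names primary and secondary, the placeholder identifiers, and the helper to_citation_pair are assumptions made for this example.

```python
# Hypothetical rows mimicking graph_df.
# "primary" and "secondary" are assumed column names, not the package's own.
rows = [
    {"primary": "eid_A", "secondary": "eid_B", "Direction": 1},
    {"primary": "eid_A", "secondary": "eid_C", "Direction": 2},
]


def to_citation_pair(row):
    """Return (citing, cited) according to the Direction convention."""
    if row["Direction"] == 1:
        # Paper from the primary list cites the paper from the secondary list.
        return row["primary"], row["secondary"]
    if row["Direction"] == 2:
        # Paper from the secondary list cites the paper from the primary list.
        return row["secondary"], row["primary"]
    raise ValueError(f"unexpected Direction value: {row['Direction']!r}")


citation_pairs = [to_citation_pair(row) for row in rows]
```

Normalizing every row to a (citing, cited) pair makes the graph easier to pass to tools that expect directed edges.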
source_code.functions module
- source_code.functions.calculate_connections_number(graph_df, paper_population)
This function calculates the number of connections for each paper in a population of papers.
INPUT: - graph_df: a dataframe generated in the function ploting_connection_graph() - paper_population: a population of papers (a list of eids)
OUTPUT: - df: a dataframe containing each paper’s metadata and a connections column with the number of connections inside the population
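The counting step can be sketched in plain Python as follows. This is a minimal illustration, not the package's implementation: it assumes the graph is available as a list of (eid, eid) pairs and that a connection counts only when both endpoints belong to the population. The function name count_connections and the sample identifiers are hypothetical.

```python
from collections import Counter


def count_connections(edges, paper_population):
    """Count, for each paper, its connections to other papers in the population."""
    population = set(paper_population)
    counts = Counter()
    for first, second in edges:
        # Assumption: edges leaving the population are ignored.
        if first in population and second in population:
            counts[first] += 1
            counts[second] += 1
    # Papers without any connection are reported with a count of 0.
    return {eid: counts[eid] for eid in paper_population}


edges = [("p1", "p2"), ("p1", "p3"), ("p2", "x9")]
connections = count_connections(edges, ["p1", "p2", "p3"])
```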
This function extracts only the papers corresponding to the given topic. A paper is kept if the predefined keywords are present in its title, abstract, or author keywords.
INPUT: - eids_list: a list of papers to be checked against the keywords - keywords: user-defined keywords - publications_with_errors: a list in which a publication is saved if any error occurs during the processing
OUTPUT: - eid_list: a list of kept papers corresponding to the given topic
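A minimal sketch of such a keyword filter is shown below. It makes several assumptions that the documentation above does not confirm: paper metadata is supplied as a dictionary keyed by eid (the real function retrieves it from Scopus), matching is case-insensitive, and a single matching keyword is enough to keep a paper. The names matches_topic, filter_by_keywords and records are hypothetical.

```python
def matches_topic(record, keywords):
    """Return True if any keyword occurs in the title, abstract or author keywords."""
    text = " ".join([
        record.get("title") or "",
        record.get("abstract") or "",
        " ".join(record.get("author_keywords") or []),
    ]).lower()
    return any(keyword.lower() in text for keyword in keywords)


def filter_by_keywords(eids_list, records, keywords, publications_with_errors):
    """Keep the eids whose metadata matches; record failing eids separately."""
    kept = []
    for eid in eids_list:
        try:
            if matches_topic(records[eid], keywords):
                kept.append(eid)
        except (KeyError, AttributeError, TypeError):
            # Missing or malformed metadata: remember the publication and move on.
            publications_with_errors.append(eid)
    return kept


# Hypothetical metadata standing in for data retrieved from Scopus.
records = {
    "e1": {"title": "Deep learning for batteries", "abstract": None,
           "author_keywords": ["energy storage"]},
    "e2": {"title": "Medieval poetry", "abstract": "", "author_keywords": []},
}
errors = []
kept = filter_by_keywords(["e1", "e2", "e3"], records, ["storage"], errors)
```

Here e1 is kept through its author keywords, e2 is dropped, and e3 lands in errors because no metadata is available for it.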
- source_code.functions.creating_connection_graph(name, paper_population, publications_outside_scopus)
This function creates a network graph of publications and saves it in xlsx format.
INPUT: - paper_population: a population of papers (a list of eids) - publications_outside_scopus: a list of publications outside Scopus (required for get_EIDS)
OUTPUT: - graph_df.xlsx: an Excel table representing the created graph
- source_code.functions.get_EIDS(paper_object, publications_outside_scopus)
This function returns a list of eids for a given paper according to the Scopus database. If the paper is not in Scopus, it is saved into publications_outside_scopus.
INPUT: - paper_object: information extracted from a query using ScopusSearch - publications_outside_scopus: a list for saving the publications that are not available in Scopus
OUTPUT: - eids_list: a list of eids in the Scopus database
- source_code.functions.get_cited_papers(reference_paper_eid)
This function returns a list of references for a given paper.
INPUT: - reference_paper_eid: the eid of the paper in Scopus, e.g. 2-s2.0-85101235827
OUTPUT: - cited_papers: the references of the paper with the given eid
- source_code.functions.get_citing_papers(reference_paper_eid)
This function returns a list of citing papers for a given paper.
INPUT: - reference_paper_eid: the eid of the paper in Scopus, e.g. 2-s2.0-85101235827
OUTPUT: - citing_papers: the papers citing the paper with the given eid
- source_code.functions.get_paper_population(eids_list, paper_population)
This function checks whether the papers from eids_list already exist in a population of papers (from Scopus). If they do, the existing population is returned unchanged. If not, the missing papers are added to the existing population, which is then returned.
INPUT: - eids_list: a list of papers that are candidates for adding to the population - paper_population: the current population of papers on a given topic
OUTPUT: - paper_population: the updated (or current) population of papers on a given topic
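The update logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: the population is taken to be a plain list of eids that is extended in place, and the function name add_to_population and the sample identifiers are hypothetical.

```python
def add_to_population(eids_list, paper_population):
    """Append the eids that are not yet in the population and return it."""
    known = set(paper_population)  # set lookup avoids rescanning the list
    for eid in eids_list:
        if eid not in known:
            paper_population.append(eid)
            known.add(eid)  # also guards against duplicates inside eids_list
    return paper_population


population = ["a", "b"]
updated = add_to_population(["b", "c", "c"], population)
```

Because the list is modified in place, updated and population refer to the same object; copy the list first if the original population must be preserved.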
- source_code.functions.retrieve_paper_data(paper_population)
This function retrieves the metadata for each paper from a given population of papers (in Scopus).
INPUT: - paper_population: a list of eids representing the population of papers
OUTPUT: - df: a dataframe with several columns (see below)