File Parsing and Data Analysis in Python Part I (Interactive Parsing and Data Visualisation)

    • Adnan Zaib Bhat

      updated on 09 Jan 2019

    1) File Parsing

    Definition:

    Parse essentially means to ''resolve (a sentence) into its component parts and describe their syntactic roles''. In computing, parsing is 'an act of parsing a string or a text' [Google Dictionary]. File parsing means assigning meaning to the characters of a text file according to a formal grammar. ''Within computational linguistics the term is used to refer to the formal analysis by a computer of a sentence or other string of words into its constituents, resulting in a parse tree showing their syntactic relation to each other, which may also contain semantic and other information.'' (Wikipedia). A parser is a program that parses text files.

    Converge File:

    A Converge output file typically contains data points for various thermodynamic properties. In this project, I will be parsing an engine output file. The file contains 17 properties such as crank angle, pressure, temperature and volume, with thousands of data points for each property.

    The Converge file that I will use in this project is named 'engine_data.out' and can be found here: 

    https://drive.google.com/file/d/1L8GY56d-M8mB1KfceM-xVhGNvnCIqxjm/view 

    1.1 Data Pre-Processing

    Before one uses the information given in a file, it is very important to understand the file and to find its patterns and meaningful ways of extracting the data. This is a part of data pre-processing. Rigorously speaking, data preprocessing is a technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviours or trends, and is likely to contain many errors. Data preprocessing prepares raw data for further processing. (Techopedia)

    In data preprocessing, techniques like data cleansing, data integration, data transformation etc. are used. The first two techniques deal with missing and inconsistent data. Data transformation involves transforming the raw data into meaningful and usable formats. 

    Now, looking at the engine_data.out file in Fig. 1.1 below, it can be seen that the first line in the file contains Converge release name and date. The second line contains the column numbers. The third and fourth lines contain properties and their units. The fifth line is blank and the data points for each property occur from the 6th line up till the last line of the file. It is also clear that lines which do not contain data points start with the '#' symbol. (I opened the data file in WordPad).

    Fig. 1.1: An Excerpt From The Converge engine_data.out File
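The layout described above can be turned into a small header reader. The following is a minimal sketch and not the code used later in this project. The sample header text is invented for illustration and only mimics the described structure (release line, column numbers, names, units, an empty '#' line). `parse_header` is a hypothetical helper name, and the sketch assumes that the '#' is separated from the text by a space and that names and units are single tokens.

```python
def parse_header(lines):
    """Return (release, names, units) from the first four header lines.

    Assumes every header line starts with a separate '#' token and that
    names and units are single whitespace-separated tokens.
    """
    # Drop the leading '#' token of each header line.
    tokens = [line.split()[1:] for line in lines[:4]]
    release = ' '.join(tokens[0])      # first line: release information
    names = tokens[2]                  # third line: property names
    units = tokens[3]                  # fourth line: units
    return release, names, units


# Hypothetical header, invented for this example only.
sample = [
    "# CONVERGE Release X.Y 01Jan2000\n",
    "# column 1 2 3\n",
    "# Crank Pressure Volume\n",
    "# (DEG) (MPa) (m^3)\n",
    "#\n",
    "  -5.0e+01   1.0e-01   5.0e-04\n",
]

release, names, units = parse_header(sample)
print(release)
for number, (name, unit) in enumerate(zip(names, units), start=1):
    print(number, name, unit)
```

Pairing names and units with `zip` keeps each property next to its unit, which is convenient later when axis labels are built.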

    1.1.2  File Reading, Data Extraction

    In Python, one of the simplest ways to parse a file is to read it line by line with a for-loop, as shown in the following code.

    #Reading and extracting data from engine_data.out file
    
    #Extraction
    engine_lines = []    #empty list to collect the lines
    
    with open('engine_data.out','r') as engine_file:  #'r' stands for read only
    	for line in engine_file:
    		engine_lines.append(line)                 #appends each line to the list
    
    #Printing Information
    
    print('No of lines = ',len(engine_lines),'\n')  #number of lines in the list
    
    print('\n First two lines: \n') 
    print(engine_lines[0:2])                        #first 2 lines
    
    print('\n First data-points line: \n')
    print(engine_lines[5])       	            #first data points line
    
    print('\nClass Type\n')
    print('Class of the engine_lines variable = ',type(engine_lines))
    print('Class of elements = ', type(engine_lines[0]), type(engine_lines[5]))
    Output 1.1. File Parsing Line-by-Line

    From Output 1.1, we see that there are 8670 lines in the file. With the above code, all lines are individual entries stored in the list named engine_lines. It can also be seen that each element in the engine_lines list is a string. However, I need to store each data point for a particular property separately as a number and not as a string.

    1.1.3 Splitting Lines and Data Integration

    Most data files are text files in which some character separates the data points from each other. In a Comma Separated Values (CSV) file, the data points are separated by commas. Looking at the engine_data.out file, it is clear that the data points are separated by spaces. At first glance, the first data lines appear to have three spaces between each pair of data points, so one could try to use this feature of the file to extract all the data points.

    For splitting the data points, the built-in string method .split() is very useful. If the Converge file were a CSV file, each line could be split by writing line.split(','). As the data points are separated by (seemingly three) spaces, I could use line.split('   ') [note the three spaces between the single quotes]. However, the three-space criterion was only my assumption. Looking closely at the file, there are certain lines with more or fewer than three spaces between the data points (of course, I realised this only after getting errors). The best way is to pass nothing to the split method; it then splits on any run of whitespace. Also, the 'non-data' lines contain the '#' symbol at the beginning. Using these two properties of the Converge file, the code to collect the data points from it is shown below.
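Before the extraction code, a short aside on the difference between the two split calls. The data line here is made up for illustration, with deliberately uneven spacing; it is not taken from engine_data.out.

```python
line = "  -5.2000e+02    1.0132e-01  3.0000e+02\n"   # made-up data line

by_three_spaces = line.split('   ')   # splits only on exactly three spaces
by_whitespace = line.split()          # splits on any run of whitespace

print(by_three_spaces)   # pieces still contain spaces and the newline
print(by_whitespace)     # clean tokens, one per data point

# Only the whitespace-split tokens convert cleanly to numbers.
values = [float(item) for item in by_whitespace]
print(values)
```

With the three-space separator, two of the values stay glued together in one piece, and calling float() on that piece raises a ValueError. That is the kind of error the fixed-width assumption produces.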

    Raw Data Extraction Code:

    #Creating different lists for properties with respective data points
    
    import matplotlib.pyplot as plt   #for plotting
    
    #Defining the lists (one per property)
    
    Crank = []
    Pressure = []
    Max_Pres = []
    Min_Pres = []
    Mean_Temp = []
    Max_Temp = []
    Min_Temp = []
    Volume = []
    Mass = []
    Density = []
    Integrated_HR = []
    HR_Rate = []
    C_p = []
    C_v = []
    Gamma = []
    Kin_Visc = []
    Dyn_Visc = []
    
    '''the above variables can be named anything like A, B, C etc. But, for clarity in code, it is better to use the property name for each variable'''
    
    
    with open('engine_data.out','r') as engine_file:
    	for line in engine_file:
    		if "#" not in line:
    			values = line.split()                     #split each data line only once
    			Crank.append(float(values[0]))            #python counts from 0 and not 1 
    			Pressure.append(float(values[1]))
    			Max_Pres.append(float(values[2]))
    			Min_Pres.append(float(values[3]))
    			Mean_Temp.append(float(values[4]))
    			Max_Temp.append(float(values[5]))
    			Min_Temp.append(float(values[6]))
    			Volume.append(float(values[7]))
    			Mass.append(float(values[8]))
    			Density.append(float(values[9]))
    			Integrated_HR.append(float(values[10]))
    			HR_Rate.append(float(values[11]))
    			C_p.append(float(values[12]))
    			C_v.append(float(values[13]))
    			Gamma.append(float(values[14]))
    			Kin_Visc.append(float(values[15]))
    			Dyn_Visc.append(float(values[16]))
    
    plt.plot(Volume,Pressure)
    plt.show()

    Pressure-Volume Plot from the above code is given in figure 1.2 below:

    Fig. 1.2: Pressure-Volume Plot From Raw Data Extraction
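For comparison, the same extraction can be sketched more compactly with NumPy, whose loadtxt function skips comment lines itself. This is an alternative and not the approach used in this project; it assumes every non-comment line has the same number of numeric columns. The small file written below is made-up stand-in data with 3 columns, so that the snippet runs on its own.

```python
import os
import tempfile

import numpy as np

# Made-up stand-in for engine_data.out with 3 columns instead of 17.
text = (
    "# CONVERGE Release X.Y\n"
    "# column 1 2 3\n"
    "# Crank Pressure Volume\n"
    "# (DEG) (MPa) (m^3)\n"
    "#\n"
    "  -10.0   0.10    5.0e-04\n"
    "    0.0   2.50  6.0e-05\n"
    "   10.0   1.20    9.0e-05\n"
)

with tempfile.TemporaryDirectory() as folder:
    file_name = os.path.join(folder, 'sample.out')
    with open(file_name, 'w') as handle:
        handle.write(text)

    # Lines starting with '#' are ignored; columns are split on whitespace.
    data = np.loadtxt(file_name, comments='#')

crank, pressure, volume = data.T     # one 1-D array per column
print(data.shape)
print(pressure)
```

The arrays could then be passed to plt.plot in the same way as the lists above. The trade-off is that loadtxt raises an error for rows of unequal length, so the explicit loop gives more control over malformed lines.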

    The code given in Section 1.1.3 works, but it is in no way interactive or appealing. Often, one would like to enter a given Converge engine file, select two desired properties to be plotted with proper labels and titles and be able to save the figures. That would require some complexity in coding. I will explain that step by step in the next sections.

    2) Interactive File Parsing And Data Visualisation

    The code given in Section 1.1.3 is not versatile at all. An interactive program has to handle many situations that could otherwise crash it. For example, if a user enters an invalid file or a column number that doesn't exist, the program won't work and is likely to crash. I will discuss each situation along with its solution. A properly working program would be one which will:

    1. Let the user input the name of the file
    2. Check the existence and validity of the file
    3. Parse the file, extract Converge version, release date, property names, and units from the first 4 lines, and
    4. Let the user select (two) properties to be plotted.

    Above all, it is important that the program

    1. doesn't crash at any time
    2. re-plots unless the user decides to exit at will
    3. reruns unless the user decides to exit at will

    The program can crash in the following cases:

    While inputting the file:

    1. The user enters a file which doesn't exist
    2. The user enters a file which exists but is not readable
    3. The user enters a file which is readable but is not a Converge file

    While inputting other responses:

    4. The user enters a wrong column number
    5. The user enters a wrong response wherever some response is needed

    In short, the program must be crash-proof.
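As an illustration of guarding against wrong column numbers, the check can be isolated in a small function that never raises for bad text. This is a sketch under my own naming; `parse_column_choice` is a hypothetical helper and does not appear in the program given later, which performs the same check inline inside a while-loop.

```python
def parse_column_choice(text, n_columns):
    """Return a 0-based column index, or None if the entry is invalid."""
    try:
        number = int(text)
    except ValueError:          # e.g. 'abc', '2.5' or an empty string
        return None
    if 1 <= number <= n_columns:
        return number - 1       # Python counts from 0
    return None


# A few entries a user might type, valid and invalid.
for entry in ['8', ' 2 ', '0', '18', '2.5', 'pressure', '']:
    print(repr(entry), '->', parse_column_choice(entry, 17))
```

A prompt loop can then simply repeat input() until the function returns something other than None.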

    The criterion for deciding whether a given file is a valid Converge engine file has to be some unique characteristic that exists in all such files. I have selected the existence of the word 'CONVERGE' (after the # symbol) in the first line of the file as the criterion for a valid file. With the pseudo code below, it is possible to make a 'software' solution for reading, extracting and plotting data from similar files. However, some assumptions need to be made, namely that all Converge engine files contain:

    1. hash symbols in the first 5 lines
    2. the word 'CONVERGE' as the second element in the first  line
    3. version name and release in the first line after the word 'CONVERGE'
    4. column numbers in the second line
    5. properties in the third line
    6. units in the fourth line
    7. an empty 5th line (with # at the beginning) 
    8. data points for each property from the 6th line onwards
    9. data points that are convertible to numbers
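The first two assumptions can be checked on their own before any data is parsed. The sketch below is one possible form of such a check, with the hypothetical name `looks_like_converge_file`; it tests for an exact 'CONVERGE' token, which is slightly stricter than the substring test used in the program in Section 2.2. The two small files are invented for the demonstration.

```python
import os
import tempfile


def looks_like_converge_file(file_name):
    """Return True if the first line reads '# CONVERGE ...'."""
    try:
        with open(file_name, 'r') as handle:
            first_line = handle.readline()
    except (OSError, UnicodeDecodeError):
        # Missing file, a directory, no permission, or undecodable content.
        return False
    tokens = first_line.split()
    return len(tokens) > 1 and tokens[0] == '#' and tokens[1] == 'CONVERGE'


with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, 'good.out')
    bad = os.path.join(folder, 'bad.out')
    with open(good, 'w') as handle:
        handle.write('# CONVERGE Release X.Y\n# column 1 2\n')
    with open(bad, 'w') as handle:
        handle.write('# DIVERGE Release X.Y\n# column 1 2\n')

    results = {
        'good': looks_like_converge_file(good),
        'bad': looks_like_converge_file(bad),
        'missing': looks_like_converge_file(os.path.join(folder, 'none.out')),
    }

print(results)
```

Reading only the first line keeps the check cheap even for files with thousands of data lines.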

    2.1 Pseudo-Code:

    Based on the above assumptions, the following pseudo code illustrates the program idea.

    import libraries
    
    start main Loop:
      ask for the file name from the user
    
      if the name does not exist:
      	prompt the user to enter an existing file
    
      if the name exists:
      	try parsing the file
    
      	if not able to read file:
      		prompt the user to enter a valid file
    
      	if able to read and parse:
      	  check validity by finding 'CONVERGE' in the first line
    
      	     if 'CONVERGE' not in first line:
      	     	prompt the user to enter valid file
    
      	     if 'CONVERGE' in first line:
      	     	extract labels and units
      	     	extract data columns 
      	     	extract Converge release
    
      	     	Begin Particular-File-Loop:
      	     		print the converge release
      	     		print the column names with numbers
    
      	     		prompt the user to enter a valid column for x axis
      	     		print the selected column
      	     		prompt the user to enter a valid column for y axis
      	     		print the selected column
    
      	     		plot the graph with labels and titles
      	     		try:
      	     		  to create a folder in the current directory
    
      	     		if folder exists:
      	     		 save the plot in the folder
    
      	     		else:
      	     			create the folder and then save the plot
    
      	     		ask the user whether to re-run with the same file or a different file
    
      	     		if rerun:
      	     		 then stay in Particular-File-Loop
    
      	     		else:
      	     		  go to Main Loop
    
        ask the user whether to enter a new converge file or to exit the program
    
        if new file:
        	stay in the Main Loop
    
        else:
        	exit the program

     

     

    2.2 Python-Code

    The Python code for interactive file parsing and data visualisation is given below.

    ### Program to Parse a Converge Engine Data Thermodynamic File###
    
    #This code checks the existence and validity of a Converge Engine Data file,
    #Then parses the valid file, extracts release version, labels, units and data points
    #Stores data points in each column (a property) in separate arrays/lists
    #The arrays/lists can then be used to plot graphs between two properties at a time
    #The plots are saved in proper folders with proper names
    #User will always be prompted to enter valid inputs
    
    
    #1) Importing libraries/modules:
    
    import matplotlib.pyplot as plt   #for plotting
    import numpy as np 				  #for creating arrays
    from time import sleep 			  #for interactive time delays
    from pathlib import Path 		  #creates path name class
    import os 						  #Files and directory module
    
    exit = 'y'			   #defining variable and setting default condition
    while exit == 'y':     #Main-Program-Loop
    	
    	
    	#2) Entering the file name and checking the existence of the file:
    
    	#   2.1) Inputting the file name:
    	file_input = input(' \n\n Enter the name of the Converge Engine Data File: ')
    	cwd = os.getcwd() 						#gets the current working directory
    	path = cwd + '\\'						#double backslash is interpreted as single  
    	file_path = path + file_input 			#full path name
    	file_path_find = Path(file_path) 		#class = pathlib.WindowsPath
    	
    
    	#   2.2) Checking the existence of file in the directory:
    
    	exist = file_path_find.is_file() 	    #.is_file method returns boolean values
    	
    	#     2.2.1) Code for non existing file:
    	if exist == False:
    		print(("\n No such file '%s' exists in the directory: '%s' \n\
     Make sure you enter the file name (case sensitive) along with the extension correctly.\n\n" %(file_input,path)))
    		sleep(2)                                                     #for an interactive pause
    	'''A backslash at the end of a line tells the interpreter that the statement continues on the next line'''
    	
    	
    	#     2.2.2) Code for existing file:	
    	if exist == True:                     #here, 'if exist == True:' can be replaced by an 'else:' statement
    
    		all_columns = []    #defining the type/preallocation
    
    		#3) Extracting and Checking Validity of file:
    
    		try:      #try whether the file is readable or not
    			#  3.1) Reading file and Extracting content:
    			for line in open(file_input,'r'):     #read ('r') file line by line
    				separate = line.split() 		  #splits the line automatically
    				all_columns.append(separate)	  #store each 
    
    
    			
    			#  3.2) Checking the validity of converge file:
    
    			#     3.2.1) Invalidity Test
    			if 'CONVERGE' not in all_columns[0][1]:
    				print('\n\n You have entered an invalid or corrupt file. Please enter a valid CONVERGE Engine Data Output File. \n')
    				sleep(2)
    				'''
    				The key concept here is to identify a certain unique word or words that will only be
    				contained in a Converge release file at a particular location. The above criterion is,
    				of course, valid only if we assume that all Converge engine data files are similar in 
    				format, that all such files have 17 columns, that the first line contains the word
    				CONVERGE followed by the release etc., and lastly that the data points start from the 6th line.
    
    				There are four cases that may arise:
    				1) Entered file is valid and passes the validity test. This is desired.
    				2) Entered file is invalid and doesn't pass the validity test. This is also desired.
    				3) Entered file is invalid and passes the validity test.
    				4) Entered file is not readable, like an image file.
    				In cases 3) and 4), the program again prompts to enter a valid file
    				'''
    			
    
    			#    3.2.2) Validity Test:
    				''' if the file isn't invalid, it is, consequently valid'''
    
    			else:
    				#print('\n You have entered a valid Converge Engine Data File. \n')
    					
    
    				#4) Extraction: 
    
    				#   4.1) Labels/Property names and Units
    				label_columns = all_columns[2:4]              
    				#Property name and units are in the 3rd and 4th lines respectively
    				
    				del(label_columns[0][0],label_columns[1][0])  #deleting the pound symbols
    				
    
    				'''Note: The names and units are contained in lines while the data points
    				for each property in columns'''
    				
    				#   4.2) Data points (lines)
    				data_columns = all_columns[5:]             #data points start on the 6th line (index 5)
    
    				
    				#5) Grouping:
    
    				#  5.1) Defining/Preallocating (lists)
    				
    				''' list variables can be named anything Like A,B,C etc for simplicity
    				but, it is obvious naming them properly makes code understandable more easily'''
    
    				Crank = []
    				Pressure = []
    				Max_Pres = []
    				Min_Pres = []
    				Mean_Temp = []
    				Max_Temp = []
    				Min_Temp = []
    				Volume = []
    				Mass = []
    				Density = []
    				Integrated_HR = []
    				HR_Rate = []
    				C_p = []
    				C_v = []
    				Gamma = []
    				Kin_Visc = []
    				Dyn_Visc = []
    
    				#   5.2) Grouping data into respective columns:
    				for i in range(len(data_columns)):
    					Crank.append(float(data_columns[i][0]))           
    					Pressure.append(float(data_columns[i][1]))
    					Max_Pres.append(float(data_columns[i][2]))
    					Min_Pres.append(float(data_columns[i][3]))
    					Mean_Temp.append(float(data_columns[i][4]))
    					Max_Temp.append(float(data_columns[i][5]))
    					Min_Temp.append(float(data_columns[i][6]))
    					Volume.append(float(data_columns[i][7]))
    					Mass.append(float(data_columns[i][8]))
    					Density.append(float(data_columns[i][9]))
    					Integrated_HR.append(float(data_columns[i][10]))
    					HR_Rate.append(float(data_columns[i][11]))
    					C_p.append(float(data_columns[i][12]))
    					C_v.append(float(data_columns[i][13]))
    					Gamma.append(float(data_columns[i][14]))
    					Kin_Visc.append(float(data_columns[i][15]))
    					Dyn_Visc.append(float(data_columns[i][16]))
    
    					'''The key idea here is that for each iteration (for each line, denoted by
    					'i'), the loop appends to each (property) list a data point. For example 
    					data_columns[i][7] will always append the 8th entry from each line. This way,
    					data points belonging to Volume only will be stored in the Volume list.
    					'''
    
    				DATA = [Crank,Pressure,Max_Pres,Min_Pres,Mean_Temp,Max_Temp,Min_Temp,Volume,Mass,\
    	 Density,Integrated_HR,HR_Rate,C_p,C_v,Gamma,Kin_Visc,Dyn_Visc] #groups the column lists into one list
    				
    				#  5.3) Converting pressures into absolute units (optional)
    
    				''' The three columns belonging to pressure in the file are in MPa, while the others are
    				in absolute units. Also, if needed, the crank angles can be converted to radians
    				with the np.radians() command. The other way is to multiply the pressure values in
    				Section 5.2, e.g., Max_Pres.append(float(data_columns[i][2])*1e6)
    				'''
    
    				#for i in range(1,4):
    				#	DATA[i] = 1e6*np.array(DATA[i])    
    				'''by converting a list to a numpy array, elementwise operations can be done
    				Mega = 10^6, which in python is written as 1e6 (10e6 would be 10^7)
    				However, when plotting, the pressures can stay in MPa, matching the unit labels.
    				Only while calculating shall we need to multiply the pressure arrays by 10^6
    				'''  
    				   
    	 
    
    	 			# 6) Prompting Columns to be plotted from the user:
    
    				rerun = 'r'                      #defining variable and setting default condition  
    
    				while rerun == 'r':              #Particular-File-Loop (let's call it that)
    
    					# 6.1) Creating converge file name, version and column values for display:
    
    					version = ' '	               #defining variable
    					for i in range(1,5):
    						version = version + ' ' + all_columns[0][i] 
    					print('\n\n\n',(' '*10 + '*'*10)*5)
    					print('\n\t\t\t Current File: %s' %(file_input+version))   
    					#\n creates new line and \t creates a tab space
    					
    					for i in range(len(label_columns[0])):
    						print('\t',label_columns[0][i],'=',i+1)    
    
    						'''
    						Note: the variable 'i' used in the loops can be reused to save memory.
    						However, if the value of i is to be used after the loop ends, say as a
    						counter etc, then different loops should use different variables
    						'''
    					
    					# 6.2) Prompting for the first column, X-axis:
    
    					'''
    					If the user enters a float, char or string, the program can crash. Also,
    					input() always returns a string. Using try-except this can be handled
    					'''
    					
    					x ='anything'                     #anything but an integer between 1 and 17
    					
    					while x not in list(range(1,18)):       #because there are only 17 columns
    
    						x = input('\n\n Please Enter the column number (X-axis): ')
    
    						try:
    							x = int(x)                      
    							if x not in list(range(1,18)):
    								print(' Invalid Number. Accepted Values (1-17)')
    						except ValueError:                  #input could not be converted to an integer
    							print(' Invalid Number. Accepted Values (1-17)')
    						
    					'''
    					The above code tries to convert the input value into an integer and then
    					tests whether the integer lies between 1 and 17. If yes, both the 'try' block
    					and the while-loop condition are satisfied. Otherwise, it keeps displaying
    					the error message and keeps prompting the user for a valid number
    					'''
    					
    					x = x - 1                            #because python counts from 0 :)
    					print('\t', label_columns[0][x])     #prints the selected column 
    					
    			        
    					# 6.3) Prompting for the second column, Y-axis:
    
    					y = 'anything'
    					while y not in list(range(1,18)):
    						y = input('\n Please Enter the column number (Y-axis): ')
    						try:
    							y = int(y)
    							if y not in list(range(1,18)):
    								print(' Invalid Number. Accepted Values (1-17)')
    						except ValueError:
    							print(' Invalid Number. Accepted Values (1-17)')
    
    					y = y - 1
    					print('\t',label_columns[0][y])       
    
    
    					
    					# 7) Plotting:
    
    					#   7.1) Creating Title, axes labels 
    					
    					title = 'Plot of ' + label_columns[0][y] + ' Vs ' + label_columns[0][x]
    					x_lab = label_columns[0][x] + ' ' + label_columns[1][x]   #property name and unit
    					y_lab = label_columns[0][y] + ' ' + label_columns[1][y]
    					
    					#   7.2) Creating The Folders and filename for the plot.
    					
    					folder = path + 'File Parsing\\Plot Figures\\'
    
    					'''The folder needs to be created only the first time. If it already
    					exists, makedirs raises FileExistsError, which is ignored'''
    					
    					try:
    						os.makedirs(folder)                            
    						#makedirs makes folders and subfolders. mkdir, only a single folder
    					except FileExistsError:
    						pass                       #folder was created in an earlier run
    					
    					plot_filename = folder + title + '.jpeg' 
    					#'png' or any image format can be used
    
    
    					#   7.3) Plotting the figure:
    					plt.figure()
    					plt.plot(DATA[x],DATA[y])
    					plt.xlabel(x_lab)
    					plt.ylabel(y_lab)
    					plt.title(title)
    					plt.savefig(plot_filename)    
    					plt.show()                 
    					#Note: savefig() must be placed before show(), else, a blank image is saved
    
    					
    					
    					#8) Rerunning the program with the current converge file:
    
    					'''
    					If the user wants to plot again with the same file, the program should not ask
    					again for the converge file. Thus, the user explicitly has to declare whether
    					the current file is to be used again or another file is to be used
    					'''
    
    					rerun = input('\n Press R to rerun or H to exit to home (R/H): ')
    					rerun = rerun.lower()                    
    					#lower() converts string to lower case. User can enter R,r,H or h
    
    					while rerun != 'r' and rerun != 'h':
    						print('Invalid Input')
    						rerun = input('\n Press R to rerun or H to exit to home (R/H): ')
    						rerun = rerun.lower()
    
    					'''
    					prompting the user to enter either R,r or H,h only
    					'''
    		
    					if rerun == 'h':                        
    						#entering h makes the 'Particular-File-Loop' condition false and ends it
    						print('\n Exiting to Home... \n\n')
    						sleep(1)
    
    					'''after the 'Particular-File-Loop' ends, the program returns to the
    					Main-Program-Loop'''
              
    		except:
    			print('\n You have entered an invalid or corrupt file. Please enter a valid CONVERGE Engine Data Output File. \n\n')
    			sleep(2)
    			
    	#9) Rerunning the program with a new file:
    
    	exit = input('\n\n Press Y to enter Converge file or N to exit (Y/N): ')
    	exit = exit.lower()
    
    	while exit != 'n' and exit != 'y':
    		print('  Invalid Input')
    		exit = input('\n\n Press Y to enter Converge file or N to exit (Y/N):')
    		exit = exit.lower()
    
    	
    	'''entering n makes the 'Main-Program-Loop' condition false and thus terminates the program''' 
    	
    	if exit == 'n':
    		print('\n Exiting Program...')
    		sleep(1)
    
    '''The (Y/N) prompt will be displayed every time a non-existing file is entered, an invalid Converge file is input
    or the user returns to home after plotting'''
    
    
    
    

    The program works for any number of data lines in the file. It is also independent of its own location, as long as the Converge file exists in the same folder (the program looks for it in the current working directory).

    The only limitation of this program is that a user will not be able to plot properties which have a column number higher than 17 (if there are any). This is because I use only 17 variables to store the columns, so any column beyond the 17th gets parsed but is not stored in a separate variable. Also, I have set the valid column numbers to 1-17 only.
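One possible way around this limit, sketched here as an assumption rather than as part of the program above, is to transpose the parsed rows instead of naming 17 lists by hand. The hypothetical helper `rows_to_columns` builds one list per column for however many columns the file has; the rows below are made-up values.

```python
def rows_to_columns(rows):
    """Turn equal-length rows of strings into one list of floats per column."""
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError('rows have different numbers of entries: %s' % sorted(widths))
    # zip(*rows) pairs up the 1st entries, the 2nd entries, and so on.
    return [[float(item) for item in column] for column in zip(*rows)]


# Made-up rows standing in for the split data lines of a file.
rows = [
    ['-10.0', '0.10', '5.0e-04', '300.0'],
    ['0.0', '2.50', '6.0e-05', '900.0'],
    ['10.0', '1.20', '9.0e-05', '700.0'],
]

columns = rows_to_columns(rows)
print(len(columns))          # number of columns found in the data
print(columns[1])            # second column
```

The valid range for the column prompt would then be 1 to len(columns) instead of a fixed 1 to 17. A file with an incomplete extra column makes the function raise a ValueError, which the calling code would need to catch and report.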

     

    2.2.1 Types of files

    I have stored the Python file named 'engine_data.py' in a folder named Data Analysis. Along with it, I have copied the valid Converge file 'engine_data.out', an image file named image_png, a pdf file named 'engine.pdf', a copy of the Converge file named 'non_converge.out' with 'CONVERGE' erased from the first line, and a copy of the Converge engine file named 'more_columns.out' with an additional incomplete 18th column. Figs. 2.1 - 2.3 show the various files in the directory.

     

    Fig. 2.1: Contents In The Folder Where Python Program Is Stored.

    Fig. 2.2: Changing Converge To Diverge (rest of the data is the same as that of the valid file)

    Fig. 2.3: File Containing an Extra Column (rest of the data is the same as that of the valid file)

     

    2.2.2 Working

    I will run the program through the following steps:

    1. enter a non-existing file
    2. enter an image file
    3. enter a wrong response
    4. enter a pdf file
    5. enter the non_converge file



    6. enter the valid converge file
    7. enter Crank as the x-axis and Volume as the y-axis 


    Fig. 2.4: Plot of Volume Vs Crank
    8. re-run with the same file
    9. try to plot PV diagram


    Fig. 2.4: Volume Vs Pressure Plot
    10. go to home (Main Program Loop)
    11. enter the more_columns file
    12. try to plot the 18th column
    13. plot Pressure Vs Volume plot correctly
    14. go home



    15. exit the program
      (the program exits just after hitting n)

     

    The program should have created a folder named File Parsing containing a sub-folder named Plot Figures, which holds the figures that I generated from the program.
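The program builds this folder path with Windows-style backslashes. As a hedged, platform-independent sketch of the same step, os.path.join can assemble the path and os.makedirs with exist_ok=True can create both levels without failing on later runs. The temporary base directory is used only so that the example cleans up after itself; in the program, the base would be the current working directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    folder = os.path.join(base, 'File Parsing', 'Plot Figures')

    os.makedirs(folder, exist_ok=True)   # creates both folder levels
    os.makedirs(folder, exist_ok=True)   # second call: no error, nothing to do

    plot_filename = os.path.join(folder, 'Plot of Volume Vs Crank.jpeg')
    created = os.path.isdir(folder)

print(created)
print(os.path.basename(plot_filename))
```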

    Fig. 2.5: Creation of Folder 'File Parsing'
    Fig. 2.6: Creation of Sub Folder
    Fig. 2.7: Plot Figures

    I have made a video of the working of the program (below)

     

     NOTE: The final PV plot was created from the more_columns.out file which, even though it is an invalid file, nonetheless contains valid data for 17 columns.

    For the Engine Performance, check out the second part of this project:

    File Parsing and Data Analysis in Python Part II (Area Under Curve and Engine Performance).

    [link: https://projects.skill-lync.com/projects/File-Parsing-and-Data-Analysis-in-Python-Part-II-Engine-Performance-82072]

     

                                                             ***END***


