[0:00] hi everyone and welcome to a special [0:02] python tutorial where we are going to [0:04] learn how to perform web scraping so [0:07] first of all thanks to freecodecamp for [0:09] giving me this opportunity of being a [0:12] guest on their channel and i have a [0:14] youtube channel as well that is named [0:16] jimshapedcoding and you can find there [0:18] any tech related topic such as [0:20] programming languages web development and [0:23] more content that i am uploading once or [0:26] twice a week so you can just go ahead [0:28] and find the link in the description [0:30] okay so in this video i'm going to do my [0:33] best to teach you anything that is [0:35] related to web scraping and i'm going [0:37] to do that with the beautiful soup [0:39] library and that is a special library [0:42] that will allow you to gather any [0:44] information you want from any website [0:46] you want okay so this website could be [0:49] your bank account could be a job post [0:52] website like linkedin this could be [0:54] wikipedia or a sports website and really [0:57] anything that you can think about so we [1:00] will start by scraping a basic html page [1:03] first just to understand the concepts [1:06] and then we will move on to scraping a [1:08] real website and in the last 15 to 20 [1:11] minutes of this tutorial i'm going to [1:13] show you how you can store the [1:15] information that we have just pulled [1:17] from this website so let's begin [1:23] great so this is the webpage that we are [1:25] going to [1:26] start web scraping and i'm going to [1:29] explain what is going on here so you can [1:31] see that we have a basic title and [1:33] then we have a kind of three [1:36] paragraphs so you can see that we have a [1:38] title of python and then we have a kind [1:41] of secondary title and then there is a [1:43] basic explanation about the course [1:45] itself and then we have a button [1:48] that says start that will probably lead
[1:51] us to a different page if we click on it [1:53] and then you can see that it has the [1:55] price here as well now we are kind of [1:58] repeating ourselves three times here and [2:01] this is what is responsible for that web [2:04] development paragraph and then also for [2:07] that machine learning paragraph now what [2:10] we are currently looking at is [2:11] basically the behind the scenes of that [2:14] page so this is the html code that is [2:17] defined in order to show you that hello [2:20] start learning page and you can see that [2:23] inside our html document all of the [2:25] code is being created with tags now [2:29] those tags are what is responsible for [2:31] displaying different information for you [2:33] and you can see that we have a big tag [2:35] that is called html and then inside of [2:38] that html tag we have a head tag [2:41] and then a body tag now you can see that [2:44] we are defining a closing tag for each of [2:46] our tags with the forward slash here and [2:50] then you are probably going to see that [2:52] for the different tags as well now let's [2:55] expand the head tag here and then inside [2:58] of it we are seeing some meta [2:59] information that is not quite relevant [3:02] for us but we see that link tag which is [3:05] responsible for importing some styling for [3:07] our page and then we can see that title [3:10] tag which is responsible for customizing [3:13] our tab name and that is why you'll see [3:16] my courses over here now i will close [3:20] back the head and then i will expand the [3:23] body so the body is responsible for [3:26] displaying what is going to be on the page [3:28] itself it is the page's body and you can [3:31] see that we already have the h1 tag that [3:35] is created here and then between the [3:37] opening and closing tags which is the area where you can [3:40] write the text for that tag we see the [3:42] hello comma start learning and then we [3:45] have some div tags here and when [3:48] you see the tag
of div this is a very [3:50] basic tag that will group some [3:54] tags under a different styling so you'll see [3:57] here the class equals card what this [4:00] attribute assignment does here is [4:03] apply the card styling and that is [4:06] why you see the kind of card style [4:10] for each of our paragraphs over that [4:13] page and you can see that we have [4:16] one more div inside that card class [4:18] which is called card header so this is [4:20] the styling for card header this is why [4:23] it is called that way and then the text [4:26] is python and then we have the card body [4:29] and we have the h5 tag which is a kind [4:32] of smaller header that you can display [4:35] and if i scroll right here you can see [4:38] that python for beginners text and then [4:40] the closing tag for the h5 tag and we [4:44] have a paragraph and then the a tag [4:47] which is allowing us to lead to another [4:50] page so when you see the a tag it is [4:53] basically a reference to another page [4:55] that you can visit [4:56] now this entire code that i'm currently [5:00] marking let's actually make our page a [5:02] bit bigger here [5:03] this entire code that i just marked is [5:06] kind of repeated three times and that is [5:09] why we see the page that we saw [5:12] previously okay so it is quite important [5:14] to understand and we are going to scrape [5:18] that page and pull some information with [5:21] the beautiful soup library now if you [5:24] are confused by the script tags here [5:26] don't be because those tags are responsible [5:29] for importing some javascript libraries and [5:32] that is something not relevant for us [5:34] right now okay so we are going to switch [5:36] to python now in order to apply some [5:38] basic scraping for that page so i will [5:41] go and start working on my main.py file [5:45] and you can see that nothing is here now [5:47] before we actually start we have to [5:49] install some libraries and one of them [5:51] will
be beautiful soup so i will [5:53] open my terminal and since i'm working [5:55] with my system global interpreter i will [5:58] allow myself to install it over here and [6:02] i will go here and write pip install and [6:05] then we will write here [6:08] beautifulsoup4 so make sure that it is [6:12] not spaced or split with dashes [6:14] and then i'm going to hit enter and then [6:16] you can see that it is installed [6:18] successfully and then the next thing [6:20] that i want to install will be something [6:23] that is going to be used by the [6:25] beautiful soup library and that is the [6:27] parser so when you work with [6:29] beautiful soup you have to specify the [6:31] method that is going to parse html [6:34] files into python objects okay so there [6:38] are going to be different methods to [6:40] parse your html code and i heard that [6:44] the best of them could be the lxml [6:46] parser since if you work with the [6:49] default html parser it is not going to [6:51] deal well with broken html code so just [6:55] go ahead and install the lxml parser [6:58] library and you can also do that with [6:59] pip install and then we are going to use [7:02] that when we work with [7:05] beautiful soup so i will go here and then write [7:08] pip install lxml and then once i do that [7:12] let's wait until it's finished great so [7:15] we are ready now to go back to python [7:18] and start working with the beautiful [7:20] soup library now we have to go here and [7:25] import that beautiful soup library so it [7:27] could be a little bit confusing because [7:29] the library's folder is created as bs4 [7:34] so that is why we are going to write [7:36] here from bs4 import BeautifulSoup [7:41] like this and once i have done that i [7:44] have to figure out how i'm going to [7:47] access the content inside the home.html [7:50] file that is right there inside my web [7:54] scraping directory so in order to do [7:57] that we
have to work with file objects [8:00] now if you don't know how to work with [8:02] files in python that is totally fine [8:04] because we are going to go over it and [8:07] it also might be worth checking my [8:09] channel out to see if i have already uploaded [8:12] how to work with files in python so i'm [8:15] going to write here [8:17] with open so this is basically a [8:20] statement that will allow me to open a [8:23] file and then read the content of that [8:26] specific file so as you can see from the [8:29] auto completion i have to specify as my [8:32] first argument the file's name so i'm [8:35] going to open the parentheses here and [8:38] then inside here i'm going to write my [8:41] html file's name now since the python [8:45] file and the home.html file are in [8:48] the same exact directory it will be okay [8:51] just to write its name so it will be [8:53] home.html [8:55] and the second argument will be the [8:57] mode that you want to apply when you [9:01] open that file in python's memory [9:03] so you have a couple of options when you [9:05] work with files in python you can read them [9:08] you can write them or you can do both [9:11] and if we only want to read the content [9:14] then we somehow want to specify that we [9:17] only want to read this file so we will [9:20] open here a new string [9:22] and we will write here r so what this [9:25] tells python is basically that i'm [9:28] going to read that file only and once i [9:31] have done that i have to write here a [9:34] variable that is going to be used inside [9:37] that code block that i just created [9:40] which is the with open so i'm going to [9:42] use the as keyword and then i'm going to [9:45] create here a variable name that is [9:47] going to be used throughout the block of [9:49] the open so it will be html underscore [9:53] file and that will be basically my [9:56] variable name and then once i do that i [9:59] will go inside the [10:01] open block and then i will write
here [10:04] content [10:05] equals to html file dot read and once i [10:10] apply the read method i'm basically [10:13] reading the html file content and in [10:17] order to show you how this works let's [10:20] first print the content itself so i will [10:22] go here and print the content and then i [10:25] will run the main dot py and then [10:28] you can see that the information that is [10:30] printed is exactly what we saw in the [10:34] home dot html okay so [10:36] we kind of did a great job reading this [10:39] file now in my future episodes we are [10:43] going to read html from real [10:46] websites but i just want to give you an [10:48] idea of how web scraping works in a very [10:52] basic way because when you work with [10:54] actual websites the scraping and the [10:57] information pulling is going to be quite [10:59] a bit harder than with the html file that i [11:02] have just written in order to explain the [11:05] idea of web scraping okay so i'm going [11:07] to continue on here and i'm going to use [11:11] the beautiful soup library in order to [11:14] prettify my html and work with its tags [11:18] like python objects so the way you can [11:22] accomplish that will be by creating an [11:25] instance of beautiful soup and i will go [11:28] here and create a new variable let's [11:30] call it soup and that is going to be [11:33] equal to a new instance of the beautiful [11:36] soup class now the arguments that i'm [11:39] going to specify here will be the html [11:42] that i want to scrape so the [11:45] content of that will be the content [11:47] variable that is created up above and [11:49] then the second argument will be the [11:52] parser method that we want to use so we [11:55] will pass the parser method as a string [11:58] and that will be the lxml that we have [12:01] just installed previously now once i go [12:05] ahead and try to print what is inside [12:08] that soup instance it will be something [12:11] like the following so
we will create [12:14] here a print statement and then we will [12:16] go with soup dot prettify so that will [12:20] allow you to see the html code in a more [12:23] pretty way and if i go ahead and run [12:27] this you can see that we see the html [12:29] content that is exactly the same as [12:33] what we saw in the home.html so we have [12:37] done a great job until now so let's [12:40] minimize back our terminal and now we [12:43] are going to get more familiar with the [12:45] special methods that are created inside [12:47] the beautiful soup library so we are [12:50] going to delete the print from here and [12:52] we are going to start working on how we can [12:54] grab some specific information that we [12:57] want to grab so let's assume that we [13:00] want to grab all the html tags that are [13:03] created as h5 tags which is a kind of [13:06] header tag so we will go here and create [13:08] a new variable let's call it tags for [13:11] example and then we will go with soup [13:14] dot find and then once i go with find it [13:17] is going to search for the specific html [13:20] tag that i'm going to specify here as a [13:23] string so if i go here and write h5 and [13:27] then down below i go ahead and print the [13:31] tags the results of that will be [13:34] something like the following now you can [13:36] see that we have the entire html tag for [13:39] the h5 tag as you can see that its text [13:43] is python for beginners but if you [13:46] remember we have more than one h5 tag [13:50] created inside our home html [13:53] file so if you remember from the home [13:55] file there is one here there is the [13:58] second one over there and there is the [14:00] third one over there and what that means [14:04] is that the find method searches [14:07] for the first element and then it stops [14:10] searching for the html [14:13] tag that you are looking for now if you [14:16] want to change this behavior and not [14:18] only
grab the first element then [14:21] basically you have to change your method [14:23] into find underscore all okay so that [14:27] will search for all the h5 tags inside [14:30] the content and now if i go ahead and [14:33] run that then you can see that the [14:35] result here is quite different as we [14:38] have here a list and then you can see [14:40] that it has [14:42] python for beginners and then also [14:44] python web development and then also the [14:46] python machine learning now that could [14:49] be a great way to bring you back all [14:53] the course names from that webpage so [14:57] you can go here and change this into [15:00] courses [15:02] html tags okay so this is what the h5 [15:05] tags are actually responsible for and [15:08] now i can write here some different code [15:10] that will allow me to see all the [15:13] courses that are defined on our page so [15:17] we have python for beginners and then we [15:20] have python web development and then we [15:22] also have [15:23] python machine learning so we can work [15:26] with this courses html tags variable that stores [15:29] all the h5 html tags and write a nice [15:33] program that is going to display all the [15:35] courses so we can actually create here [15:38] an iteration over the courses html [15:40] tags because it is a list so we will go [15:44] here with for course in courses html [15:48] tags and then inside of that course tag [15:51] that we are iterating over we can bring only [15:54] the text attribute which is going to [15:57] display the course text itself so it [15:59] will be here course [16:01] dot text and now if i go ahead and run [16:05] our program then you can see that we [16:07] have a nice output regarding all of the [16:10] courses that are available from that [16:12] page so this could be a nice starter to [16:15] understand how you can scrape a web page [16:18] to grab some specific information you [16:20] want all right so we were able to [16:22] understand how
we can apply some basic [16:24] scraping to a web page but when you are [16:27] going to deal with real websites the [16:29] html code is not going to be quite as [16:31] friendly and simple as what we had here so [16:35] in order to be able to access the html [16:39] code behind the scenes of some page we [16:42] have to use the inspect tool of any browser [16:45] so let's say that you want to grab the [16:48] price for each of the courses so it [16:51] makes sense to go with your mouse and [16:53] hover over that button and then right [16:56] click on it and then you want to look [16:58] for that inspect option and once you [17:01] open that you will have a new pane [17:05] that is going to be opened and then here [17:07] we can see all the html code that is [17:10] responsible for displaying what is going on [17:13] on the left pane so you can see that we [17:16] have here let's make it a little bit [17:18] bigger [17:19] so that will be enough and then you can [17:22] see that we have here div class card [17:25] three times which is displaying all the [17:27] different courses now when you go over [17:30] different html tags with your mouse you [17:32] can see that it is going to mark for you [17:35] the part of the page that is related to it so it [17:38] is quite an important behavior that we [17:40] should understand [17:42] now let's say that we want to grab the [17:44] price for that python for beginners so [17:46] it makes sense to expand this tag and [17:49] see what is inside so i will go here and [17:53] search for that button and you can see [17:56] that this a tag is actually responsible [17:59] for that [18:00] button itself and then you can see that [18:02] its text is start for twenty dollars so [18:06] the price information is right there and [18:10] let's actually write a program that is [18:12] going to search for that python for [18:14] beginners and then we will grab the [18:17] price for that course and then we will [18:20] be able to write a nice program
that is [18:22] going to include a list of all of the [18:25] courses and then the prices for each one [18:28] of them so let's go back to pycharm and [18:31] write this program so we will go here [18:33] and delete everything from here and the [18:37] first step that we probably want to do [18:40] is to be able to grab all the course [18:43] cards so it will be course [18:46] underscore cards equals to soup [18:50] dot find underscore all because we [18:53] probably are looking to bring back [18:56] all the cards so this is why you have to [18:58] use find all and not find and i'm [19:01] going to search for the div tags now it [19:04] could be much nicer if we could filter [19:08] the div tags that we actually want to [19:11] grab and store them inside our course [19:13] cards so if you noticed let's go back to [19:17] our courses page and here if i just [19:21] expand all the div tags you [19:23] can see that there is something that is [19:26] common for all the div tags their class [19:30] is equal to card so i can filter my div [19:33] tags by this expression right there so i [19:37] go back to pycharm and i will write here [19:41] class equals to card but now you can see [19:44] that there is an error and it is quite [19:47] an important behavior to understand you [19:49] have to apply here the underscore [19:51] because class is a reserved keyword [19:54] in python that is used to create python [19:57] classes so that is why you have to add [19:59] the underscore over here and then [20:02] beautiful soup will understand that you [20:04] are referring to the html class [20:08] attribute okay so it is important now [20:10] since we have all the course cards [20:12] stored right in this variable then we [20:15] probably want to iterate over this list [20:18] and then search for the course name and [20:20] then the course price so let's see how [20:23] we can do that for each of our course [20:26] cards so we will start with a [20:29] for loop
here and that will be for [20:31] course in course cards and before we go [20:34] ahead and write some more code inside [20:36] our for loop let's actually remind you [20:39] what is inside each of our courses and [20:42] then you can see that we have h5 tags on [20:46] each of our course cards and it makes [20:48] sense to access these specific h5 tags so [20:52] we can accomplish that by going here and [20:55] then using the h5 tag as an attribute so [21:00] if i go ahead and add here dot h5 and [21:04] re run my program then you can see that [21:06] we were able to grab each of our h5 tags [21:10] that are inside the course card so it is [21:13] quite a great thing and now [21:16] if i revert this back to course again [21:19] and run that you can also see that [21:22] inside our a tags we have the text for [21:26] start for 20 dollars and that is [21:29] repeated for all of our cards as well so [21:32] first of all it makes sense to delete [21:35] this again and write here something like [21:38] course name [21:40] equals to course [21:42] dot h5 and then here we probably look [21:46] for the text attribute of that h5 tag so [21:50] i will write here dot text and then this [21:53] course name will be responsible for storing [21:56] the text [21:57] on each iteration so it is great and now [22:01] i can go here and [22:03] write course price and then this time i [22:06] will search for course dot a because the [22:09] a tag stores the information about the [22:12] course price so until now if i go ahead [22:16] and print the course name and then i [22:19] also go ahead and print the course price [22:23] then we will see the results like the [22:26] following so you can see that we have [22:28] python for beginners and then we have [22:31] the a tag itself but in this case we [22:34] look for the text of that a tag as well [22:37] so i will [22:39] minimize my terminal and excuse me [22:41] for that i will delete that from here [22:44] and then search for the
text attribute [22:46] over here as well and now i will run my [22:49] program and then you can see that we [22:51] have python for beginners and then we [22:53] have the text for each of our a tags and [22:56] now since we reached this stage it might [22:59] be a great idea to print a sentence [23:02] like python for beginners costs 20 okay [23:05] so the way we can do that [23:08] is basically using the split method to [23:11] access the last element of that text [23:14] because the price is located as the last [23:18] word so it makes sense to go here with [23:21] split and then we will split it by [23:24] whitespace so we don't have to specify [23:26] anything here and we want to grab that [23:28] last element so we are looking for the -1 [23:31] index over here and now if i run it you [23:35] can see that we have the price [23:37] for each of our courses and now it might [23:40] be much nicer if we go ahead and use an [23:43] f-string to print a dynamic sentence for [23:47] each of our courses so we will go here [23:50] with print and then we will open an f [23:52] string and then we will access the [23:54] course name so it will be course [23:57] underscore name and then we will write [24:00] costs and then we want to display the [24:03] course price so it will be course [24:06] underscore price now if i run our [24:09] program then you can see that it [24:11] displays nice information about each [24:13] one of the courses [24:15] now if you think about it that is [24:17] quite a nice behavior that we have applied [24:19] here because if you scrape a real [24:21] website like udemy that keeps updating [24:24] courses then it might be a great idea to [24:27] launch this program every certain amount [24:30] of time for example each week and then [24:32] you have the ability to be aware of [24:35] each of the courses that udemy has [24:37] updated on the webpage that you [24:40] scrape so this is quite a nice behavior that [24:43] we were able to reach here
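the steps from this first part can be put together in one short sketch. the home.html file itself is not shown in the transcript, so the html string below is an assumed stand-in rebuilt from the description (class names like card-header and card-body, the paragraph text and the prices of the second and third course are guesses). the helper names pick_parser and list_courses are made up for illustration, and the fallback to python's built-in html.parser is only there so the sketch still runs if lxml is not installed, the video itself uses lxml

```python
from bs4 import BeautifulSoup

# Assumed stand-in for home.html: the real file is not shown in the
# transcript, so class names other than "card", the paragraph text and
# the 50$ / 100$ prices are guesses made for illustration only.
HOME_HTML = """
<html>
  <head><title>My Courses</title></head>
  <body>
    <h1>Hello, Start Learning!</h1>
    <div class="card">
      <div class="card-header">Python</div>
      <div class="card-body">
        <h5>Python for beginners</h5>
        <p>A first course for people who are new to Python.</p>
        <a href="#">Start for 20$</a>
      </div>
    </div>
    <div class="card">
      <div class="card-header">Python</div>
      <div class="card-body">
        <h5>Python Web Development</h5>
        <p>Build web applications with Python.</p>
        <a href="#">Start for 50$</a>
      </div>
    </div>
    <div class="card">
      <div class="card-header">Python</div>
      <div class="card-body">
        <h5>Python Machine Learning</h5>
        <p>An introduction to machine learning with Python.</p>
        <a href="#">Start for 100$</a>
      </div>
    </div>
  </body>
</html>
"""


def pick_parser():
    """Return "lxml" as in the video, or the built-in parser if lxml is missing."""
    try:
        import lxml  # noqa: F401  (only checking that it is installed)
        return "lxml"
    except ImportError:
        return "html.parser"


def list_courses(html):
    """Return a list of (course name, price) tuples, one per card."""
    soup = BeautifulSoup(html, pick_parser())
    results = []
    # class_ (with the underscore) filters on the html class attribute,
    # because class on its own is a reserved keyword in python
    for course in soup.find_all("div", class_="card"):
        course_name = course.h5.text               # first h5 inside the card
        course_price = course.a.text.split()[-1]   # last word of the link text
        results.append((course_name, course_price))
    return results


if __name__ == "__main__":
    # With a real file you would read it first, as in the video:
    #   with open("home.html", "r") as html_file:
    #       content = html_file.read()
    for name, price in list_courses(HOME_HTML):
        print(f"{name} costs {price}")
```

returning the pairs from a function instead of printing inside the loop is just a choice made here so the result can be reused or checked, the video prints directly inside the for loop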
[24:47] on this one we are going to scrape real [24:49] websites with the requests library so i'm [24:52] going to simulate this against a website [24:55] that is going to search for job [24:57] advertisements and i'm going to bring [24:59] all the jobs from a specific website [25:02] whose main skill requirement is the [25:06] python programming language and i'm [25:07] going to write a program that is going [25:10] to pull the latest published job [25:12] advertisements from a specific website [25:15] so it is going to be very interesting so [25:17] let's get started all right so one of [25:19] the first things that we must do is to [25:22] ensure that we have the requests library [25:25] installed so i'm going to go down to my [25:29] terminal right in pycharm and i'm going [25:32] to write here pip install requests just [25:35] to make sure that i have the requests [25:36] library installed now the output for [25:40] myself could be different than yours [25:41] because you may not have the requests [25:43] library but since i already have that [25:46] you can see outputs like requirement [25:49] already satisfied okay so it is quite [25:51] important now i'm going to minimize the [25:54] terminal and write here import requests [25:58] so you want to make sure that you do [25:59] that after the installation of this [26:01] library and the first thing that i'm [26:03] going to do here is to use the get [26:07] method of the requests library now what [26:11] the requests library is doing behind the [26:13] scenes is just requesting information [26:15] from a specific website so it is like a [26:18] real person [26:19] going to a website and requesting some [26:22] information okay so you can go with [26:24] something like the following when it [26:26] comes to the requests library so it will be [26:28] requests dot get so you want to get [26:32] specific information from a website and [26:34] here we are going to provide an empty [26:36] string for now but later on we are
going [26:39] to complete this string with the url [26:41] that we are going to scrape [26:44] and i'm going to assign this to a new [26:46] variable and i will call it html text so [26:50] i'm going to make that to be equal to [26:52] this entire statement now let's go to a [26:55] web browser and look up the website [26:57] that is going to include some job ads [27:00] okay so this is timesjobs.com and this [27:04] website includes job posts about almost [27:07] everything so you can simply go down [27:09] here and search for some skill that you [27:12] own and then this will search for [27:15] jobs that are requiring this specific [27:17] skill in that position now this video is [27:20] recorded a couple of days before i [27:23] uploaded it so if you watch this video [27:25] after a couple of months or even a year [27:27] or two since the publish date then there [27:30] is a great chance that the html elements [27:32] are going to be quite different but the [27:34] main point of this video is to teach you [27:37] all the tools to pull information from a [27:40] website just as you want and then you [27:43] can apply your own customizations and [27:45] kind of reverse engineer [27:48] the code that i'm going to write [27:50] throughout this tutorial great so let's [27:52] go here and write python so i will [27:55] receive only job posts about this [27:58] programming language and you can see [28:00] that we have this jobs found over there [28:04] and we have a lot of jobs that are [28:06] published so my goal here in this [28:09] tutorial would be to [28:12] let's get this closed so my goal in this [28:15] tutorial will be to bring all the jobs [28:19] that are posted a few days ago so if i [28:22] zoom in here then you can see that [28:26] we have posted a few days ago for a [28:29] couple of posts but after i reach down [28:32] here we have posted four days ago so [28:36] this might mean that this job post is
not the most updated so i'm going to [28:42] bring all the jobs and i'm going to [28:45] condition my program to bring those [28:48] elements with the posted few days ago [28:51] text only so let's go back to here now [28:55] i'm going to bring this url from here [28:58] and i'm going to paste that in the [29:01] empty string that we created inside the [29:04] requests.get and once i have done that [29:07] what is stored inside this variable [29:09] right now is simply the response with its [29:13] status code okay so if i'm going to [29:16] print the i mean if i'm going to run [29:18] this program then we are going to see [29:21] the results like the following so 200 is [29:24] the conventional status code on the web meaning that the [29:28] request completed successfully but in [29:31] order to get the html rather than the status code we [29:34] are going to go to here and i'm going to [29:37] accept the text only so i'm going to go [29:40] here and then write dot text okay so [29:43] this is what we have to apply here in [29:46] order to bring the html text of that [29:49] specific page and now it makes sense to [29:52] leave this variable name as it is [29:54] because it is storing the html text and [29:56] i'm going to re run this program and we [29:59] will probably receive a large [30:02] amount of html so right now it is [30:05] not quite relevant but i just [30:07] wanted to show you the results so let's [30:10] continue from here okay so as you know [30:13] we are going to [30:14] create a beautiful soup instance like we [30:17] did in the previous episode and i'm [30:19] going to provide the html as the html [30:22] text variable so it will be soup equals [30:26] to [30:26] an instance of beautiful soup and then [30:29] i'm going to write here html text as the [30:32] information that i want to scrape and we [30:35] are going to use the same parser again [30:37] like the previous episode so it will be [30:40] lxml now once i have done that it makes [30:43] sense to go back to our page
and see how [30:47] we can grab each paragraph [30:50] from this website so the white boxes are [30:54] kind of a list of elements that this [30:57] page has provided here and i want to [31:00] look for a method that is going to bring [31:03] me all the job posts so it makes sense [31:06] to catch a certain element inside that [31:10] post and right click on it and then [31:13] click on inspect and once i have done [31:16] that you can see here so i'm going to [31:19] zoom in things a little bit [31:21] so we can see that the h3 class is [31:26] pointing to that [31:28] text over here i know that the text is a [31:31] little bit small here but you can [31:33] see that it has a gray mark and i'm [31:36] going to go up here and then you can see [31:40] that those elements are opened up as [31:42] well so if i hover my mouse here then [31:45] you can see a green background wrapped [31:47] around the article over here i mean the [31:49] paragraph and then if i close that up [31:52] you can see that we have a lot of clearfix [31:55] job dash bx and something like that [31:59] as the class name and our html [32:04] element here is called li so li stands [32:07] for list item and then you can see that it is [32:09] inside a ul tag so this is standing for [32:14] unordered list and it is containing a [32:16] lot of [32:17] list item tags inside that ul so you can see [32:20] once i close that then the entire [32:24] list of all the posts is marked with a [32:27] blue [32:28] background so i'm going to search for the [32:31] li element with that class name so [32:34] i'm going to copy the name of the class [32:37] here and i'm going to go back to my [32:39] pycharm and i'm going to write here jobs [32:43] equals to soup dot find [32:46] underscore all and i'm going to search [32:49] for all the li's and as the second [32:53] argument it makes sense to pass here [32:55] class underscore equals to and then [32:58] inside that string i'm going to paste
that in the class name that we have [33:04] copied from the page itself so once i [33:07] have done that then we will probably see [33:10] the results of all the jobs in that page [33:14] now this doesn't mean that it is going [33:16] to bring back all the [33:19] 16 000 jobs because you can see that [33:23] this page is being paginated so that [33:27] means that it is going to bring the [33:29] results only for the first page so this [33:32] is not going to take extremely long now [33:35] if i go back to here and print the jobs [33:39] then let's see the results before we [33:40] continue on just to make sure that [33:42] everything is okay so we can see that we [33:45] receive the results and then we see that [33:47] we have some company names and i think [33:50] that everything is quite great here now [33:52] in order to work with this [33:55] scraping project it makes sense to [33:57] work with only one job element so i'm [34:01] going to [34:02] delete the underscore all from here and [34:05] what this means is that it is [34:07] going to bring the first match that has [34:11] the li tag and then the class name as [34:14] this string over here so let's change [34:17] this variable name just to job for now [34:20] okay just in order to develop our [34:22] program more slowly [34:23] relying on only one job post okay so [34:28] once we've done that we probably want to [34:30] search for the company name of that [34:32] specific job post so i'm going to go [34:35] back to here and i'm going to make [34:38] things bigger over here and now let's [34:41] actually go here and try to inspect what [34:44] is going on [34:45] here again so let me [34:48] zoom that out great now i'm going to [34:52] try to inspect this text over here again [34:55] and then we can see that it is inside [34:58] the li tag for sure but we can also see [35:01] that it is inside an h3 tag and it has [35:04] the class name of job list comp name so [35:08] i'm going to search for that
class in [35:12] the entire page as well but speaking [35:15] about the entire page so let's go to our [35:18] pycharm you want to search for that [35:22] specific element only inside the job [35:25] itself so you see it doesn't make sense [35:28] to search for an h3 tag in the entire [35:31] page again so you can basically go with [35:35] job.find instead of soup.find [35:38] because we want to search for that h3 [35:41] tag only inside our job so if i go ahead [35:44] and print the job here then we can see [35:47] that it only includes the html code for [35:50] only one job and i'm going to search for [35:53] this h3 tag [35:55] so let's create here a new variable and [35:58] i'm going to call that company [36:00] underscore name and we are going to use [36:02] job.find [36:04] and we are going to pass here as an [36:07] argument the h3 and then this time the [36:10] class underscore is going to be equal to [36:13] whatever this h3 tag includes as the [36:16] class name which is the job list comp [36:19] name now to debug this and to ensure [36:24] that the results are great we are going [36:26] to print the company name [36:28] and then you can see that we receive [36:30] this [36:31] element back and i'm going to use [36:33] here the dot text attribute just to bring [36:36] back the text itself now once i do that [36:40] we are going to see a weird result here [36:43] now you can see that we have some white [36:46] spaces so we kind of want to replace our [36:50] white spaces with nothing so in order to [36:54] do this one i'm going to go here and i'm [36:57] going to use the replace method and this [37:01] trick is going to avoid having these [37:04] unnecessary white spaces so i'm going to [37:06] replace the spaces with nothing so i'm [37:10] going to just write here a space in quotes and then [37:12] two quotes with nothing between them and [37:15] once i have done that and rerun our [37:18] program then you can see that the result [37:20] is going to be 
quite different as you [37:22] can see this text is fully aligned to the [37:25] left now let's minimize back and [37:28] continue from here now we're going to [37:30] zoom out the code a little bit here just so [37:33] we can see the important points like the [37:35] replacement [37:36] and let's continue from here now it also [37:39] makes sense to bring the skill [37:42] requirements other than the python [37:44] programming language because we know [37:46] that this job is only for people who are [37:50] good with the python programming [37:52] language so i'm going to go here and i'm [37:55] going to repeat the same [37:57] process again and i'm going to write [37:59] here job.find and we are probably [38:02] looking for an element that includes [38:05] text about the skill requirements so [38:09] let's search for that okay so let's go [38:11] back to our website again and i'm going [38:13] to [38:14] go here and check out what html element [38:17] includes the skills so we are [38:20] talking about this one so i'm going to [38:22] inspect inside here and we can see here [38:26] that this text is inside a span tag [38:30] with the class name of srp skills so i'm [38:34] going to copy again this class name and [38:38] this time i'm going to search for the [38:40] span elements inside my job post so i'm [38:44] going to go back to pycharm again and [38:47] i'm going to write here span so this is [38:49] the html tag that we are searching for [38:52] and again i'm going to write class [38:54] underscore equals that srp skills now [38:58] i want to check the results over here [39:02] once again so you always want to [39:06] quickly print the results of whatever [39:08] html element you want to pull to [39:10] see what other methods you have to apply [39:14] to prettify your result okay so let's [39:18] run our program again and it makes sense [39:20] to delete the print company name so [39:23] let's re-execute our program [39:27] and then 
you can see here that we have [39:30] a span tag and then here we have a [39:33] strong tag which is basically created to [39:36] make our text bold when we want to type [39:39] in something so i'm just going to [39:41] guess here that i'm going to [39:44] only write here dot text and then i [39:46] expect the results to be fine so [39:49] let's check that and then you [39:50] can see here that the results are quite [39:52] great so we have the python scripting [39:55] and then we have some more requirements [39:58] that are divided with commas and a lot [40:01] of white spaces again so i'm going to [40:04] apply the same replace method [40:06] once again like we did with the company [40:09] name so let's write here dot replace and [40:11] i'm going to replace white spaces [40:14] with nothing so let's re-execute that [40:18] and then we can see that the result [40:20] is quite like we want and now we were [40:23] able to grab the skills as well so [40:26] this is quite nice now if we want to [40:30] display some nice information about the job [40:33] so far then we want to go with a nice [40:36] print message here so let's try to [40:38] create a nice message so we will use an [40:40] f-string here and we will also use the [40:43] triple quote method just to allow us to [40:46] write some text on separate lines as [40:49] well and i'm going to write here company [40:52] name [40:52] like this and then i'm going to write [40:56] here company name so i'm calling the [40:59] company name value by writing it inside [41:02] curly brackets and i'm going to repeat [41:05] the same process for required skills so [41:09] it will be required skills and then i'm [41:12] going to make that equal to the skills [41:16] variable and now if i go and execute our [41:19] program let's see if the results are [41:22] quite nice [41:23] yes so we are kind of receiving some nice [41:25] information about the job okay so [41:29] this is quite great 
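The steps so far can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the HTML is a made-up stand-in for one results page, the `job-post` class on the `li` is a placeholder (the real class name is not fully audible), and the hyphenated spellings `joblist-comp-name` and `srp-skills` are a guess at the class names spoken above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for one results page; the real page will differ.
SAMPLE_HTML = """
<ul>
  <li class="job-post">
    <h3 class="joblist-comp-name">
      Acme
    </h3>
    <span class="srp-skills">
      python  ,  django  ,  linux
    </span>
  </li>
  <li class="job-post">
    <h3 class="joblist-comp-name">
      Globex
    </h3>
    <span class="srp-skills">
      python  ,  sql
    </span>
  </li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all returns every matching <li>; find returns only the first one.
jobs = soup.find_all("li", class_="job-post")
job = soup.find("li", class_="job-post")

# Search inside the single job element rather than the whole page.
company_name = job.find("h3", class_="joblist-comp-name").text.replace(" ", "")
skills = job.find("span", class_="srp-skills").text.replace(" ", "")

# replace(" ", "") removes spaces only, so the surrounding newlines are still there.
print(f"""
Company Name: {company_name}
Required Skills: {skills}
""")
```

Note that `replace(" ", "")` would also squeeze the spaces out of a multi-word company name, which is one reason to prefer `strip()` when only the leading and trailing whitespace is unwanted.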
[41:31] now if we go back to here then we want [41:35] to search for one more element so [41:38] you remember that i told you that we [41:41] only want to grab the [41:44] job posts with the text of posted few [41:47] days ago so we for sure want to write [41:51] some extra code to apply this [41:53] functionality so i'm going to go here [41:56] and i'm going to inspect that [41:58] element again [42:00] and then we can see that it is inside a [42:02] span once again but i can also see that [42:06] this [42:07] job post includes some more span [42:10] tags so i have to filter the results [42:14] again with the class name itself so i'm [42:17] going to search for that sim posted [42:21] class name and i'm going to go back to [42:24] here so we will write this time [42:27] job [42:29] published date [42:31] so it makes sense to delete the job [42:32] excuse me so it is just going to be [42:34] published date and i'm going to go here [42:37] again with job.find [42:39] and we will search for the span and then [42:42] this time the class underscore is going [42:45] to be equal to the text that i just [42:47] copied and i'm going to repeat myself [42:51] with printing the published date but [42:55] this time let's just avoid printing this [42:58] print line so i'm just going to comment [43:00] out those lines and let's see what the [43:03] published [43:04] date text looks like and you can [43:06] see that we have here something [43:09] a little bit weird so we have the span [43:13] here and we also have one more span [43:16] inside of it so what that [43:19] means is that we have to take a [43:21] different action than what we did [43:23] previously so this time i want to access [43:26] the span attribute just to get [43:29] inside that tag over here and then right [43:32] after it i want to look for the text of [43:34] that span tag so this will give me the [43:38] published date of this specific job but [43:43] i'm not going 
to include the published [43:44] date inside my print message because we [43:47] only want the published date for the [43:49] functionality to stop our execution if [43:52] the published date text does not include [43:56] the word few and i'm going to code [43:59] this functionality in just a second so [44:01] you will see what i mean by what i said [44:03] all right so what i'm going to do here [44:06] is take a tricky action that is going to [44:08] bring me all the jobs from the first [44:11] page so if we paid attention then all [44:14] the job posts include this class name [44:18] so what i can do instead of find is [44:21] change that back to underscore all and [44:25] change this variable name to jobs and i [44:28] know that just now it raised an [44:31] error here and i'm going to use here a [44:34] for loop that is going to iterate over [44:37] each element and i'm going to write here [44:40] for [44:40] job in jobs and then i'm going to [44:44] indent the entire code that [44:48] is right there so the results will be [44:51] applied for all the jobs that are posted [44:55] on the first page of the [44:57] web page that we scrape so once i hit [45:00] here the colon sign then i'm going to [45:03] create an indentation for each of our [45:06] lines like this and then the results are [45:09] going to be quite the same so let's test [45:12] that out okay i'm going to [45:14] uncomment our print line over here and [45:17] just for convenience i'm also going [45:19] to print here an empty line so we can [45:22] kind of see a division between the [45:24] different jobs and then i'm going to [45:26] delete the published date for now so if [45:30] we execute our program [45:32] this time then we are going to see [45:35] nicer results and this is going to [45:37] contain all the job posts [45:39] from the page that we scrape so [45:42] you can see that we have a nice [45:43] paragraph for that job post and then we [45:46] 
also have another one here and if i keep [45:48] scrolling up we can see a lot of them in [45:51] that output so this is quite great so if [45:54] you remember we wanted to filter out the [45:58] job posts that do not include the [46:02] word few inside the published date [46:06] because what that means is that [46:08] this job could be outdated so if i go to [46:12] our page again then we can see that as i [46:15] keep scrolling down we have some [46:18] text like posted six days ago and i [46:21] wanted to keep only the jobs that [46:23] contain the text of posted few [46:26] days ago so in order to apply this i'm [46:29] going to change the order here a little [46:32] bit okay so i'm going to [46:34] cut this search here and i'm going to [46:37] paste that in as the first line inside [46:41] my for loop now the reason i'm doing [46:43] this is basically because i don't [46:45] want to continue scraping that [46:48] post if the published date does not match [46:51] my condition so it makes a lot of sense [46:54] to place this code as the first line [46:58] inside my for loop and then right here [47:01] i'm going to [47:03] write a condition that is going to check [47:05] if the word few is inside that text [47:10] so it will be if [47:12] few in [47:14] published date and again i'm going to [47:17] create an indentation for the entire [47:20] code here so you can do that with the [47:22] shift alt combination and then you can just [47:25] press tab and all the lines here are [47:29] being indented so right now if i go [47:32] ahead and execute our program then we [47:36] should see the results again almost [47:39] the same but we also see here that the f [47:43] string is not quite nice [47:46] but i can live with that okay so it is [47:49] great that we were able to receive [47:53] only the posts that have been published few [47:56] days ago now there is no limit to what [47:59] you can do when it comes to web 
scraping [48:02] and what you can filter in or filter out [48:05] but basically this program deals with [48:08] how to grab some job posts with the [48:11] filters that you want to apply that [48:14] sometimes may not be available [48:16] on the website itself so you can write [48:19] your own filters in your python code [48:22] while you scrape some information from a [48:25] specific website [48:29] so i'm going to do whatever it takes to [48:31] turn this program into a very useful one [48:34] and i'm going to do that by applying [48:36] some special functionalities such as [48:39] wrapping this entire program in a while [48:41] loop and executing this project every [48:45] certain amount of time and also applying [48:47] some filters to filter out the job [48:51] posts that do not match the skills [48:54] that i have and also i'm going to throw [48:56] the results of the different job posts [48:59] into a new blank file so i can be aware [49:02] of the posts that are being posted every [49:05] certain amount of time so let's get [49:07] started all right then so let's start [49:09] with a kind reminder of the results that [49:11] we got up to that point so [49:14] we run our program now and if we show it [49:18] right here you can see that those lines [49:21] are not aligned well so i'm going to [49:24] change that and i'm also going to [49:26] provide some extra information that will [49:28] show us the exact [49:30] link of the specific job that we are [49:33] iterating on so that way i will have the [49:35] ability to just click on the link and [49:38] then see more information about that job [49:41] so to begin with i will get rid of the [49:43] formatted string in that case because [49:46] doing a formatted string with a triple [49:48] quote might not be a great idea when you [49:51] execute it inside a for loop because as [49:53] you can see it also includes the [49:55] indentations right here so i'm going to [49:58] delete this entire 
code here and i'm [50:01] going to write two new formatted [50:04] strings and we will start with company [50:07] name [50:09] make that equal to company name so [50:12] make sure to add a colon here so it [50:14] will be more friendly and then i will [50:16] write here required skills as well and [50:19] then we will write here the skills [50:22] variable now there was one more issue [50:25] with the result that we showed a minute [50:27] ago and that was the blank spaces that [50:31] are being shown as well so we can get [50:35] rid of the spaces with a special method [50:37] that is called strip and it is a [50:40] method that you are allowed to call [50:43] on strings and since the company [50:45] name and the skills are strings by [50:48] default i don't have to convert them to [50:50] a string so i can just call that method [50:54] like this okay and now i will show the [50:57] results of [50:58] something like the following and in a [51:01] few seconds we will see that [51:03] this is aligned way better than what it [51:06] was and i'm also going to add here one more [51:09] information line that will show the link [51:13] of the job post so let's do that okay [51:16] let's go here and write this [51:19] functionality okay so we had an [51:21] unordered list and inside of that we [51:24] had some different html tags that are [51:27] called li and that stands for list item and [51:30] they are actually different job posts [51:33] that are divided into different elements [51:36] inside an unordered list and then if we [51:39] hover our mouse you can see that there [51:41] are different jobs now if i go inside [51:44] one of them and i go inside a header tag [51:47] that is actually the first child of the [51:50] li tag and then i will [51:53] go inside the h2 here and then you can [51:56] see that we have a link that could lead [52:00] us to a page that provides some extra [52:02] information about that specific job so [52:06] if i actually 
[52:07] go here and click on here you can see [52:10] that we receive the job description [52:13] right here so what we have to do in [52:15] order to access this link in each job [52:18] post that we are iterating on in the python [52:21] code is actually go inside the header [52:24] and then go inside one more tag which [52:26] is the h2 as you saw me doing [52:29] and then access that a [52:31] tag so let's do that okay i'm going to [52:34] go back to pycharm and apply this [52:36] functionality so we will go under the [52:40] skills and then we will write here more [52:43] info and that will be equal to job [52:47] dot header because this was the first [52:50] tag that we want to go inside of and [52:53] then we want to go inside the h2 and [52:56] then inside that h2 we want to go inside [52:59] the a tag now before we go further let's [53:03] check that we have done a great [53:05] job so let's print the more info in the [53:09] following way so it will be more info [53:12] and then we will call the variable in a [53:15] formatted string now let's execute our [53:18] program [53:20] and then you can see that inside the [53:22] more info we have the a href which gives [53:26] us the link to the [53:28] specific job that we are iterating on so [53:31] all i have to do here is go back to [53:34] my more info and then read that href [53:38] attribute so this time i'm going to do [53:40] that with square brackets like in [53:42] dictionaries and then i'm going to write [53:44] here href so i will receive the value of [53:47] that attribute so if i run that one more [53:50] time [53:51] then i should see the link only and that [53:54] is exactly what is happening so the result [53:57] is quite great and then you can see that [53:59] this is already better than what we did [54:01] in the last episode and we will continue [54:04] from here okay so what i want to do now [54:06] is give the user [54:09] that executes this 
program the opportunity to filter out [54:12] some skill requirement that he does not [54:15] have so we will use the input function [54:17] for that and then whatever the input is [54:19] equal to we will filter that out of the results [54:22] from the jobs that we are finding [54:25] right here okay so let's write this [54:28] functionality so to apply this i'm going [54:30] to create a new variable over here and [54:33] i'm going to call it unfamiliar skill [54:36] and i'm going to make that equal [54:38] to an input and then i'm going to [54:41] write here something like this okay so [54:43] the user can understand that he has to [54:47] provide some information in order to [54:49] execute this program and actually it [54:51] might be a great idea to print some [54:53] extra information before that input [54:56] function so it will be print [54:59] put some skill that you are not [55:03] familiar with and then right after the [55:06] unfamiliar skill input i will write here [55:10] filtering out [55:12] and we will actually make that a [55:14] formatted string and then we will write [55:16] here filtering out and then whatever the [55:20] unfamiliar skill is equal to [55:23] now what are we going to do with this [55:25] unfamiliar skill variable so that is [55:29] quite easy right we have to write a [55:32] condition that will filter out the job [55:35] post that includes the word that we [55:39] are going to provide here as an [55:41] unfamiliar skill and what we can [55:43] actually do is search for the unfamiliar [55:46] skill word inside the skills string so [55:50] if you remember the skills is a long [55:53] string that is divided with commas so we [55:56] can go with a condition like the [55:59] following so it will be if unfamiliar [56:02] skill [56:03] not in the skills that we are [56:07] grabbing in each job post that we [56:10] are iterating on and now all we [56:12] have to do here is create the indentation [56:15] for the different print 
lines okay so [56:17] now i should see the job posts that do [56:21] not include the unfamiliar skill that [56:24] i'm going to provide so just to test [56:27] that out let's [56:29] run our program twice okay so in the [56:32] first run we are going to write here linux [56:35] as a skill that i'm not familiar with [56:38] and you can see that we don't see [56:40] anything that includes the keyword [56:43] linux over here but let's actually [56:46] take that to the next level and test that [56:48] out so we see here a specific job post [56:52] that includes django so let's say [56:55] that i am not familiar with django and [56:57] see next time if i see that job post [57:00] with this company so let's re-execute [57:04] our program and this time i will write [57:07] django [57:08] and let's see the results so we can see [57:11] that we don't have any job with django [57:15] but we do have linux this time so [57:18] this condition works well and we will [57:21] continue on to the next step from here now [57:24] what could be an exciting challenge for [57:25] you guys is to write an algorithm that [57:28] will accept more than one unfamiliar [57:31] skill so you want to accept multiple [57:33] inputs from a user and it might be more [57:36] challenging but i think you should try [57:39] to spend some time on something like [57:40] this because i think this could be an [57:42] amazing challenge for everyone who is [57:44] watching this video all right so now we [57:46] are going to save each job post in a [57:49] different file so instead of printing this [57:52] in the terminal we are going to [57:55] write this entire information in a [57:58] separate file and then i will also [58:00] allow this program to run every 15 [58:04] minutes or every 10 minutes up to you [58:06] and i will show this logic as well so [58:09] first of all it makes sense to wrap [58:13] our entire program in a function and i'm [58:16] going to do that by collecting [58:19] 
everything that is kind of pulling the [58:22] information from the website and i'm [58:25] going to indent everything one step [58:27] to the right and then i'm going to write here [58:30] def find [58:32] jobs okay so that way we have one [58:36] function that executes our main program [58:39] and then what i'm going to do here is [58:42] use the logic of if double underscore [58:45] name is double underscore main so that [58:48] way if you want to extend this program [58:51] this function will be executed only if this file is run directly [58:54] and not when it is imported now if [58:57] you don't know what i said about if [58:59] double underscore name equals double [59:01] underscore main then i have a video that [59:04] explains this condition so you can check [59:06] that out by the suggested link above so [59:10] let's write here if double underscore [59:13] name [59:14] equals double underscore main inside [59:17] a string and then right here while true [59:21] so i want to run this program forever [59:24] and then i will call find jobs and [59:28] right after it since i don't want this [59:31] program to be executed like every [59:33] millisecond i'm going to write here [59:37] time dot sleep so time dot sleep allows [59:41] your program to wait a certain amount of [59:44] time that you decide and you provide [59:47] its argument in seconds so i'm going to [59:50] write here 600 just to make that program [59:53] run every 10 minutes but you can [59:56] notice that we did not import the time [59:59] library so let's do that with [60:02] import time okay and then this program [60:05] should be okay now to make this more [60:07] dynamic i actually prefer to [60:10] make a variable here that will be [60:12] equal to 10 and then i will just pass [60:15] time_wait [60:17] multiplied by 60 to sleep and right after it we [60:20] can provide some extra information [60:23] excuse me this should be over here and [60:25] we can write here waiting [60:30] let's 
make it formatted [60:31] then we can write here waiting [60:34] time_wait seconds [60:37] and let's write three dots here great so [60:40] this is great so if i'm executing this [60:42] program i expect to see this program [60:44] running every 10 minutes so i'm [60:47] inside my command line interface and you [60:50] can see that my directory has [60:52] already been set to the directory where we [60:56] worked so i can go with python and then [60:59] execute the file by calling [61:02] its name so it will be main.py and then [61:05] once i run that you can see that we [61:08] receive [61:09] this [61:09] output and then i have to provide some [61:12] skill that is going to be filtered [61:14] out and then let's write here django [61:17] again and [61:18] you can see that we receive the results [61:22] successfully but more importantly we see [61:24] that waiting 10 seconds which is not [61:27] great we have to change that to waiting [61:29] 10 minutes because [61:31] we are waiting 10 minutes right but the [61:34] program works great it was just my [61:36] mistake to write here seconds so it [61:38] should be minutes for sure but i'm not [61:41] going to wait 10 minutes until this [61:43] program runs one more time and so [61:45] i will allow myself to move on to [61:48] writing this information into a file so [61:52] it makes sense to write this kind of [61:55] information in a separate directory so [61:58] i will go inside my web scraping tree [62:01] file i mean folder and then i'm going to [62:04] create here a new directory which is going [62:07] to be named posts and then i'm going [62:11] to [62:11] write here some extra functionality that [62:15] will create [62:16] files i mean text files [62:19] and then inside each text file i'm going [62:22] to write this exact information so you [62:25] can do that with with open i already showed [62:29] you how you can do that in the first [62:31] episode now i know that i don't have 
any [62:33] separate tutorial about working with [62:35] files in python but you may want to consider [62:38] checking out my channel maybe i will upload one [62:40] very soon so [62:41] you can go here and [62:43] this time i want to put here the path [62:46] and i will start with my posts directory and [62:49] then inside here i have to provide the [62:53] file name that i'm going to create now [62:57] before i move on here i talked about [63:00] changing my for loop here to use the [63:03] enumerate function now the enumerate [63:06] function is going to allow us to iterate [63:10] over the index of the jobs list and also [63:15] the job content itself and so i have to [63:19] provide here one more variable like [63:22] index so the index is going to be a kind [63:25] of counter for the job that i'm [63:28] iterating on and then the job variable [63:31] will refer to the job [63:33] beautiful soup object itself and so it [63:36] makes sense to name our files with the [63:39] index of the job that i'm iterating on [63:42] so i will change this into a formatted [63:44] string and then i will write here index [63:47] dot txt so it will be something like the [63:50] following and i expect each of my text files [63:52] to be named like [63:54] 0.txt or 1.txt and so on now the second [63:59] argument will be the mode [64:02] that you want to use when you create or [64:05] open a file and this time i'm going [64:08] to write here w and that stands for [64:12] writing to the file and then i have [64:15] to use the as keyword and i'm going to [64:18] use the f variable so inside that block [64:22] i can write to the file with the f [64:26] variable and i'm going to go inside my [64:29] with open and i'm going to [64:31] indent the prints and i'm going [64:35] to delete this print line here and it [64:37] makes sense to remove this blank space [64:40] as well and all i have to do here is [64:43] change this print statement to f dot [64:48] write and then 
this time i'm not going [64:51] to print the results in the command line [64:54] interface instead i'm going to write the [64:57] information to a new file so i'm going [65:00] to use the combination of alt shift here [65:03] and i'm going to change all [65:06] three prints to f dot write okay and [65:10] then i'm going to open the parentheses [65:13] so they will be closed by those and then i [65:17] expect each job to be [65:21] written to a file and once i do that [65:24] it might be a great idea to print a [65:27] sentence like file [65:30] saved and then you can provide the name [65:33] of the file as extra information so i [65:36] will create one more formatted [65:37] string and then i will refer to that [65:41] index [65:42] variable and now our program is complete [65:46] so let's check it okay let's go back to [65:49] our command line interface and let's [65:52] actually control break this program and [65:55] let's write cls to clear our terminal [65:58] and then i'm going to re-execute my [66:01] program so it will be python [66:04] main.py [66:06] and then i'm going to [66:09] execute it so let's see this time i'm [66:12] going to write django as well [66:15] and this time i don't expect to see [66:18] the job information in the output instead i [66:20] expect to see [66:21] this okay so let's see [66:24] what is inside each of our files so [66:27] let's see what is inside that posts [66:29] directory okay so i'm going to go inside [66:32] my c python put [66:34] web scraping tree and then the posts [66:36] directory that we created a few minutes [66:38] ago and you can see that inside of that [66:41] we have our text files but if i go here [66:44] inside let's see if the results are okay [66:47] okay so [66:49] i'm not quite satisfied with that [66:51] because it might be a better idea to [66:54] see that like [66:56] i mean like this okay so [66:59] you might want to divide that [67:01] information into separate lines but 
that [67:04] is not going to be complex so [67:06] we just have to go inside our python code [67:09] again and then whenever we write to the [67:14] file we have to use that convention [67:17] where you can just jump a line and that [67:20] will be backslash n so when you provide [67:24] backslash n inside a string it is just a [67:28] convention that is going to jump to the [67:31] next line right after it so it will be [67:34] backslash n for the first line and then [67:37] also here and let's run this program one [67:40] more time so i'm just going to break the [67:42] program and re-execute it so this time i [67:45] will write linux and then let's test our [67:48] results one more time so let's go inside [67:51] our [67:52] 19.txt and then you can see that the [67:54] information is right there just like we [67:57] expected okay so this is quite great [68:00] alright guys so i hope you enjoyed this [68:02] entire series and you can find [68:05] everything that we have done here by the [68:08] links in the description of course i [68:10] will provide extra information on my [68:12] website about this series so if you like [68:16] this video consider subscribing and also [68:18] hit the like button i will see you in my [68:21] future uploads
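The filtering, file-writing and timed-loop steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the program from the video: the posts are plain dictionaries standing in for the values already pulled out of each job element, and the names `save_posts`, `run_every`, `cycles` and the injectable `sleep` parameter are hypothetical helpers added here so the loop can be stopped and tested.

```python
import os
import time


def save_posts(posts, unfamiliar_skill, out_dir="posts"):
    """Write each matching post to <out_dir>/<index>.txt and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    # enumerate gives a counter alongside each post, used for the file name.
    for index, post in enumerate(posts):
        # Skip posts whose published date does not contain the word "few".
        if "few" not in post["published_date"]:
            continue
        # Skip posts that mention the skill the user is not familiar with.
        if unfamiliar_skill in post["skills"]:
            continue
        path = os.path.join(out_dir, f"{index}.txt")
        with open(path, "w") as f:
            # "\n" ends each line so the fields do not run together.
            f.write(f"Company Name: {post['company_name'].strip()}\n")
            f.write(f"Required Skills: {post['skills'].strip()}\n")
            f.write(f"More Info: {post['more_info']}\n")
        print(f"File saved: {index}")
        saved.append(path)
    return saved


def run_every(minutes, posts, unfamiliar_skill, out_dir="posts",
              cycles=None, sleep=time.sleep):
    """Repeat save_posts with a pause in between; cycles=None means forever."""
    done = 0
    while cycles is None or done < cycles:
        save_posts(posts, unfamiliar_skill, out_dir)
        print(f"Waiting {minutes} minutes...")
        sleep(minutes * 60)  # time.sleep takes its argument in seconds
        done += 1
```

In a script, `run_every` would typically be called under `if __name__ == "__main__":` after reading the unfamiliar skill with `input()`, so that importing the file from another module does not start the loop.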