Excel for Data Analytics - Full Course for Beginners

10h 59m 44s | 127,250 words | 17,114 segments | English

FULL TRANSCRIPT

0:00

data nerds welcome to this full course

0:01

tutorial on Excel for data analytics

0:04

this is the course I wish I would have

0:06

had when I first started as a data

0:08

analyst you're going to be working right

0:09

alongside me as we master how to use a

0:11

spreadsheet starting with the basics of

0:13

functions charts and tables working our

0:15

way up to our first portfolio project

0:17

we'll then shift gears into advanced

0:19

features like pivot tables power query

0:21

and power pivot ultimately building our

0:22

second and final project analyzing real

0:25

world data now to master this tool we're

0:27

not going to go straight for 11 hours

0:29

instead we're going to break it down

0:30

into 10 to 20 minute lessons during this

0:33

we'll have exercises for you to learn

0:35

while doing not just watching followed

0:37

by practice problems to reinforce your

0:39

newly learned skills now Excel is the

0:41

most popular spreadsheet tool in the

0:43

world it's estimated to have over 1

0:45

billion users that's one in eight people

0:48

in the world and for data nerds it's one

0:50

of the most popular skills for data

0:52

analysts coming only behind SQL oh and

0:55

the same can be said for business

0:56

analysts in this tool's popularity truth

0:58

be told Excel was one of the only skills

1:01

that I knew when I landed my first role

1:03

in data analytics but it was able to

1:06

handle everything thrown at me and so

1:08

I've been cataloging over the years all

1:11

of the most important features to

1:13

perform data analytics and I compiled it

1:15

in this course and this video is for

1:17

absolute beginners you don't need any

1:19

analytic or spreadsheet experience we'll

1:22

be starting with the first half on the

1:23

basic chapters which will build up your

1:25

knowledge on the fundamentals first

1:27

covering which versions of Excel you can

1:29

use for the course along with installing

1:31

it then we'll get you familiar with

1:32

working around how to manipulate a

1:34

spreadsheet from there we'll shift into

1:36

practical exercises analyzing data using

1:38

formulas and functions and then

1:40

visualizing it using common charts and

1:41

statistical analysis at the end of the

1:43

basics chapters we'll put your skills to

1:45

the test to build an interactive

1:47

dashboard to predict one's salary based on

1:49

job and location for the second half of

1:52

the course we're going to ramp up our

1:53

learnings diving into advanced

1:54

analytical features focusing on using

1:56

pivot tables and add-ins to dive quickly

1:59

into data insights we'll learn Power

2:01

Query to connect to a variety of data

2:03

sets and perform ETL or extract

2:05

transform and load finally we'll learn

2:07

data modeling with power pivot and

2:09

perform Advanced calculations with the

2:10

DAX language by the end of the advanced

2:13

chapters we'll have built a full data

2:15

analytics project analyzing the data

2:17

science job market which you'll be able

2:19

to share this and the previous project

2:21

in order to showcase your experience

2:23

with analyzing data in Excel now I'm a

2:25

big believer in open-sourcing education

2:28

so this course and all the content

2:30

required to complete the course is

2:31

completely free I not only get you set

2:33

up with Excel but I also provide all the

2:36

different Excel workbooks and sheets

2:38

needed to complete this course with this

2:40

you'll get access to the data sets

2:41

needed to make those final projects and

2:43

even how to share them now unfortunately

2:46

the AdSense revenue alone from this

2:48

course isn't enough in order to support

2:51

all the different costs associated with

2:53

building this so I have an option for

2:55

those that want to support and help out

2:57

for those that purchase my supporter

2:59

resources you're going to get

3:00

access to a lot of features that are

3:02

going to help speed up your learning all

3:03

provided through this custom dashboard

3:05

to track your progress you'll get guided

3:07

practice problems to perform after each

3:08

lesson that will not only provide the

3:10

solution but also walk you through how

3:12

to get it if you get stuck along the way

3:14

you'll have access to a community of

3:16

others in order to jump in and comment

3:18

and ask for help additionally you'll be

3:20

getting my step-by-step instructions

3:22

that walk through each of the lessons as

3:24

I perform it and finally when you

3:25

complete the course I'll email you a

3:27

certificate of completion that you can

3:28

upload to LinkedIn now one quick shout

3:31

out before we jump in and that's to

3:32

Kelly Adams she helped me plan out a lot

3:34

of the different lessons for this course

3:36

along with being the brains behind a lot

3:39

of the different practice problems and

3:41

frankly if I didn't have help I probably

3:42

couldn't have completed this course so

3:44

before we go any further with what we

3:46

need to and actually diving into this

3:48

course we need to First understand what

3:50

is Excel and where the heck it came from

3:54

so in order to understand this we need

3:56

to go back oh a little too far back

4:00

ah just right ancient Babylon when we

4:03

used to trade livestock like it was

4:04

crypto now it's during this time that we

4:06

started recordkeeping and we didn't have

4:09

paper so we used Stone and we partition

4:11

it into rows and columns during the time

4:13

of the Romans they began to perfect this

4:15

even further with accounting eventually

4:17

we get some advancements in technology

4:18

we start getting this on paper this is

4:20

when the term spreadsheets gets the

4:23

introduction this maintained that

4:24

familiar row and column format in order

4:27

to catalog different things spread

4:29

across different sheets spread sheet

4:33

fast forward to the 1900s and we pack

4:36

rooms full of underpaid people in order

4:38

to maintain and keep track of all the

4:40

different transactions on paper

4:42

spreadsheets with the Advent of

4:43

computers in the late '70s we started to

4:45

see our first spreadsheet software VisiCalc

4:48

and Lotus 1-2-3 then our boy here decided

4:51

to revolutionize the world little

4:54

bit okay not with that but with this I'm

4:58

Bill Gates chairman of

5:00

Microsoft in this video you're going to

5:03

see the future since its launch in 1985

5:06

it's been wreaking havoc in the

5:08

spreadsheet software Community

5:09

dominating market share and to continue

5:12

to dominate over the years Microsoft has

5:13

added more and more features it

5:16

initially started out to where you'd

5:17

only be using it for the cells of

5:19

entering different formulas and performing

5:20

quick calculations along with getting

5:22

different charts and Analysis shortly

5:24

thereafter it was upgraded with pivot

5:26

tables and that's my secret weapon to

5:28

quickly analyzing data as I no longer

5:30

have to remember which comes first in

5:32

INDEX and MATCH now VBA or Visual Basic

5:35

for applications was included in the

5:37

mid-'90s and it's a programming language

5:39

in order for you to automate tasks in

5:40

Microsoft applications now we're not

5:42

going to waste any time in this course

5:44

learning VBA frankly I feel it's

5:46

outdated you should learn python instead

5:48

and there's newer tools that actually

5:50

automate the process of data analysis

5:53

like Power Query this was first

5:54

introduced in 2010 and then rebranded to

5:57

Get & Transform and then rebranded

5:59

again to Power Query sort of similar to

6:02

what Google does with renaming products

6:03

anyway this bad boy is like washing down

6:06

a couple caffeine pills with a shot of

6:08

espresso it can ingest and clean so much

6:10

data in the blink of an eye hardcore

6:12

data nerds call this ETL or extract

6:15

transform and load power pivot was also

6:18

introduced during this time of power

6:20

query and it's like putting your

6:22

spreadsheets on steroids this allows us

6:24

to perform data modeling on data sets

6:27

greater than a million rows greater than

6:29

what Excel can actually hold in the

6:30

spreadsheets and combined with the power

6:32

of DAX or Data Analysis Expressions we

6:34

can supercharge our calculations fast

6:37

forward to today and there's been two

6:38

other major features added to excel

6:41

Copilot which is basically ChatGPT

6:43

inside of Microsoft Excel and Python in

6:46

Excel which is basically Python inside

6:48

of Excel anyway Copilot is great wait

6:51

that's a lie so I do believe AI chatbots

6:53

are great at helping us out when we

6:55

get stuck but I don't want you to rely on

6:57

that to actually learn this technology

6:59

of Excel and for Python and Excel you

7:01

need to know well python if you don't

7:04

know this yet it's completely useless

7:06

now with all these features it can make

7:08

it seem like Excel is overwhelming which

7:10

I completely get that but when you focus

7:13

on the basics and work from there I

7:15

think it makes it a lot easier to learn

7:17

it's also why this course is almost 11

7:19

hours long all right enough with the

7:21

history lesson let's actually get into

7:22

the course material and what you're

7:24

going to need for this also we're going

7:25

to be going over what data set or what

7:27

data we're going to be analyzing for the

7:29

project for this with the link provided

7:30

below you can navigate to this which is

7:33

the GitHub repo that has all the

7:35

different folders and files needed to

7:37

take the course now don't worry if

7:38

you're not familiar with GitHub we're

7:39

going to walk through this this pane

7:41

here basically outlines all the

7:42

different folders that you have access

7:44

to and if I navigate in something like

7:45

resources I can see I have a data sets

7:47

folder images folder and even a problems

7:49

folder so for those that purchase the

7:51

course practice problems you have access

7:53

to the problems inside of here and

7:55

they're broken down by chapter along

7:56

with the lesson in addition to that

7:58

resources folder you can see numbered

7:59

here we have each of those eight

8:01

chapters and if we navigate into

8:03

something like spreadsheets intro we

8:04

have a workbook for each one of the

8:06

lessons so you want to download this

8:07

file you just navigate to it click the

8:09

three dots and click download but I have

8:11

an alternate method coming up in a bit

8:13

inside the workbooks I provide a blank

8:15

template for you to go through and

8:16

actually fill in and we'll be getting to

8:18

what's in this final sheet of actually

8:20

being filled in now as we move into the

8:22

advanced chapters they're going to have

8:24

something like the data sheet or you're

8:25

going to use the data from the data

8:27

sheets in order to do different

8:28

operations and we'll put those in

8:30

different sheets as well so how do we

8:32

get these files well the easiest way is

8:33

to come up here to this code and go to

8:35

download zip with the file downloaded

8:37

all you need is to unzip it and then

8:39

from there it has all the different

8:41

folders with the appropriate workbooks

8:43

inside of them now after going through a
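For anyone who prefers to script the unzip step described above, here is a minimal Python sketch. It assumes you have already used the Download ZIP option and saved the archive locally; the function name `extract_course_files` and any paths you pass to it are hypothetical and are not part of the course materials.

```python
import zipfile
from pathlib import Path


def extract_course_files(zip_path: Path, dest_dir: Path) -> list[str]:
    """Unzip a downloaded repo archive and list the Excel workbooks inside it."""
    with zipfile.ZipFile(zip_path) as archive:
        # extractall creates dest_dir if it does not exist yet
        archive.extractall(dest_dir)
    # Walk the extracted folder tree and collect every .xlsx workbook,
    # reported relative to dest_dir with forward slashes.
    return sorted(
        p.relative_to(dest_dir).as_posix() for p in dest_dir.rglob("*.xlsx")
    )
```

Calling it with the path of the downloaded archive and a target folder returns the relative paths of the workbooks, which is a quick way to check that every chapter folder came through.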

8:44

lesson I then have practice problems for

8:47

those that purchase the course perks to

8:48

go through here's the course dashboard

8:50

that you'll get access to that breaks it

8:52

all down for the problems based on the

8:54

chapter itself and then by the lesson

8:57

and inside of each of these lessons is

8:59

multiple different problems for

9:00

you to go through and work the other perk

9:02

that you'll receive with those practice

9:03

problems are the course notes these

9:05

break down the concepts in a similar

9:07

format of all the different chapters and

9:09

lesson here's the one on Excel install

9:12

which is going to be what we're covering

9:13

next but it provides all the different

9:15

background on all the different material

9:16

that I'll be covering and it's in the

9:18

same format that I'm covering it in the

9:21

video so you can follow right along just

9:23

as a reminder there's no requirement to

9:25

purchase these practice problems or

9:26

course notes just helps support me

9:28

anyway what are we actually going to be

9:30

covering in this data analysis that

9:32

we're going to be doing inside of excel

9:33

well you're going to be taking the role

9:35

of a job seeker in exploring what are

9:37

some of the top paying roles along with

9:39

skills of data nerds for this we're

9:42

going to use the data from my app

9:44

datanerd.tech that has collected to this

9:46

point up to 3 million jobs it tells

9:49

based on a job title and also on a

9:52

location what are the top skills and it

9:54

not only tells us the salary of these

9:55

skills for a particular job but also the

9:57

salaries of the jobs themselves now the

9:59

main data set we're going to be using

10:01

for the majority of this course is this

10:03

one here inside the data sets folder of

10:05

data job salary all this data set

10:07

includes over 30,000 job postings from

10:10

2023 and it includes a wealth of

10:12

information such as company name salary

10:14

and location as we go through these

10:16

examples I'm going to be doing it from

10:17

the perspective of a data analyst which

10:19

is the top job in the data set but as

10:22

shown here there's a lot of different

10:23

other job titles that you can check out

10:25

and use as well so feel free to deviate

10:27

additionally I'll be primarily focusing

10:29

on the United States but there's a lot

10:31

of different countries in there as well

10:33

so feel free to plug in your home

10:35

country and analyze this instead now

10:37

with any course you're probably going to

10:38

get stuck along the way and so how do

10:40

you get help for this well I don't

10:42

recommend just jumping into the comment

10:44

section and waiting for somebody to help

10:45

you out instead I recommend using a chat

10:48

bot like ChatGPT in it you can provide

10:50

whatever error you're seeing and it will

10:52

help you out and guide you along the way

10:54

on what to do and there's other great

10:55

options as well such as Gemini or even

10:57

Claude so feel free to use whichever one

10:59

you're most comfortable with all right

11:00

if you haven't done so already it's your

11:02

turn now to go in and download that

11:04

GitHub repo with all the different

11:05

workbooks needed for this course in the

11:08

next lesson we're going to be getting

11:10

into installing Excel and mainly

11:12

understanding what are the different

11:13

versions that you can actually get with

11:15

Excel and which one you need for the

11:17

course with it I'll see you

11:21

there let's now actually get into

11:23

working with Excel so in this lesson

11:27

we're going to be going through how to

11:28

actually install Excel onto your

11:30

computer assuming you don't have it but

11:33

before we get to that for those that

11:35

maybe have Excel or an older version of

11:37

Excel or have different computers we're

11:40

going to actually go through what are

11:41

the preliminary requirements you need to

11:43

have or set up in order to be able to

11:45

have the Excel you need for this

11:49

course now here's a breakdown of the

11:52

different chapters within this course

11:55

that is the rows here and then for the

11:57

columns are the different Microsoft

11:59

products that you can get in order to

12:02

have Excel now if you're running Excel

12:05

on a Windows machine either through

12:07

Microsoft 365 Microsoft Office Home

12:09

and Student or even an older version of

12:13

Excel back to about

12:15

2010 you're going to be fine with

12:17

completing all the different course

12:19

content however if you have the Mac

12:22

version or Mac operating system and

12:24

Excel is installed directly on that

12:26

operating system you're not going to be

12:27

able to complete the Advanced chapter

12:30

specifically on power query and on power

12:32

pivot along with the project and it's

12:35

similar as well for Microsoft 365 online

12:39

as you won't also be able to complete

12:40

the Advanced Data analysis section now

12:43

if you have any of these first three

12:45

versions of excel installed on your

12:47

computer you can skip to the next lesson

12:50

if you want I'm just going to be going through

12:52

before the install process of breaking

12:55

down each of these different versions so

12:57

you understand your options what you can

13:02

get so let's get into breaking down all

13:05

these different versions available first

13:07

up is Microsoft

13:09

365 now with Microsoft 365 you're going

13:12

to get a host of different Microsoft

13:14

applications not only Excel but also

13:16

things like Word PowerPoint and even

13:18

Outlook and there's two major plans I'm

13:20

going to recommend for this either the

13:22

family plan which allows you to give out

13:25

these keys for these different services

13:26

to up to six people or a personal plan

13:29

which allows you to give it to well

13:30

yourself now I do want to call out that

13:32

if you're a college student or maybe you

13:34

work for a big Corporation you may have

13:37

access to a free Microsoft 365 plan so

13:41

if you're in college check with your

13:42

college and if you're working for a

13:43

business check for your business if you

13:44

have access to this so you don't have to

13:46

pay money for it but regardless of that

13:48

if money is an issue Microsoft 365

13:51

family offers this free one-month trial

13:54

which I think you can complete this

13:56

course within a month so technically you

13:58

could do this for free if you don't want

14:00

to get charged you will need to actually

14:02

cancel before the end of that 30 days

14:04

and at that point you'll still have

14:06

Microsoft Excel installed on your

14:08

computer just everything will be in view

14:10

only mode you won't actually be able to

14:12

edit any of the different spreadsheets

14:14

that we've operated on during this

14:16

course let's now move into Microsoft

14:18

Office Home and

14:22

Student now this bad boy is the

14:24

alternate recommendation I'm going to

14:26

give you if you don't want to pay for a

14:28

Microsoft 365 subscription this is

14:31

only a one-time purchase and it gives you

14:34

keys to Microsoft Office so you can

14:36

install all the different Microsoft

14:38

products of Excel Word and PowerPoint

14:41

onto your computer for the low low price

14:43

of $150 similar to Microsoft 365

14:47

subscription this will not only work on

14:49

a Windows machine but it will also work

14:51

on a Mac machine Let's now move to this

14:54

last option because it's sort of in the

14:55

bundle of it of Microsoft 365 online

15:01

now this version of Microsoft 365 is

15:04

completely free but sort of a catch to

15:07

this here I am on my web browser logged

15:10

into Microsoft 365 online and I have

15:13

access to all the different apps within

15:16

the browser including something like

15:18

Excel so we can go to it now this

15:20

version looks very similar to the

15:22

version that you can actually install

15:23

the applications on your Windows or Mac

15:25

machine there are limitations like I

15:28

discussed before about power query and

15:30

power pivot so you're going to be

15:32

limited if you're trying to follow along

15:34

in this course when we get to those

15:35

Advanced chapters also the layout on the

15:38

web browser version of this app is much

15:40

different from the one that's installed on

15:42

your computer so I'm not going to be

15:43

providing any support in this course on

15:45

actually how to navigate this

15:47

so you're going to have to figure that

15:48

out yourself so we've discussed

15:49

everything except for these Mac versions

15:52

of Microsoft 365 and office so here's a

15:55

quick recap of all the different

15:58

features and cost of the three major

16:01

versions of Microsoft that you can get

16:03

in order to get Excel on your computer

16:05

for this personally I'm using the

16:08

Microsoft 365 family plan because it

16:10

includes all the different features that

16:13

I need and I also save on cost because

16:15

I'm splitting with my brother who now

16:17

that I think of it is actually paying

16:18

for it but it provides everything that I

16:21

need and so it's the one I'm

16:22

recommending for this

16:26

course now before we get into the

16:27

install I want to briefly show what are

16:30

the differences between using Mac with

16:33

Excel installed versus Windows with Excel

16:35

installed on it anyway here's Excel

16:37

installed on my Windows operating system

16:41

and Excel on this operating system is in

16:43

my opinion the flagship product from

16:47

Microsoft so they're investing all of

16:49

their effort and resources into

16:51

designing this application to make it

16:54

the best possible and then from there

16:56

Excel online and then Excel for Mac are

16:59

really just copycats of this anyway the

17:02

two main differences and the problems

17:03

I've run into in the past that Excel for

17:07

Mac doesn't have are in this data tab I

17:10

have a lot of different data sources I

17:12

can choose from and that's specifically

17:14

related to our power query lesson and

17:16

then finally it has power pivot which is

17:20

just completely non-existent on Excel

17:22

for Mac now here I am on a Mac machine

17:25

and we can see that it looks very

17:27

similar to before but there's a lot of

17:30

limitations that we're going to find

17:31

with this specifically going back to

17:33

that power query not a lot of different

17:34

sources you can choose from and then

17:36

yeah Power pivot is just completely

17:38

non-existent you may be like Luke I have

17:41

a Mac machine what do I need to do in

17:43

order to have the most premier version

17:45

of Excel and use for this well for that

17:48

I recommend installing a virtual machine

17:51

and virtual machines like parallels

17:53

shown here allows you to host a

17:57

different operating system on your Mac

18:00

machine this Windows example that I was

18:02

showing earlier if I actually expanded

18:04

out you can see in the background here

18:07

I'm running this on a Mac machine and I

18:11

have full capabilities and I'm able to carry

18:14

on running Windows on this now I've

18:16

been paying for and using parallels over

18:18

the past 3 years and I can tell you the

18:20

support and the offers from it are

18:22

perfectly fine and I love using it now

18:24

personally I'm using the Parallels

18:27

Desktop Pro Edition but you can get by

18:29

with just using the standard edition now

18:31

they also have this onetime purchase

18:33

that you could do which is 129 but it

18:35

doesn't get any further updates and I

18:37

really like how it actually updates and

18:40

fixes any bugs that I may run into now the

18:42

other reason why I like parallels is

18:44

because it has this coherence mode I

18:46

have this blue little icon that I can

18:49

click up at the top to go into coherence

18:51

mode and then wait for it it allows me

18:54

to access any of those windows inside of

18:57

my Windows virtual machine inside

19:00

of Mac so here is Excel running right

19:02

here inside my Mac and this is not only

19:04

limited to Microsoft Excel but also

19:06

products like Power BI which I'm using

19:08

pretty frequently as a data analyst I

19:10

can also run this in coherence mode

19:12

but enough about

19:14

that now that we got that out of the

19:16

way let's actually get into installing

19:19

Excel on your Windows machine or on

19:21

your Windows Virtual Machine so the

19:23

first thing we need to do is navigate

19:24

over to

19:26

microsoft.com and I'm going to click up

19:27

here to Microsoft 365 we're going to

19:30

be going through setting up the free

19:32

30-day version so I'm going to click

19:33

this of try for free and from there

19:36

start my one month trial it's going to

19:38

ask me to sync my data I'm going to assume you

19:39

don't have it I'm also going to assume

19:41

you don't have an account so we're going

19:42

to create one I'm going to put in my

19:44

email

19:45

address and then from there create a

19:47

password after providing some personal

19:49

information you're going to need to

19:50

verify your email with the code they

19:51

send you now to be clear this is the

19:54

Microsoft 365 family plan which after

19:57

that 1 month trial it's going to be

19:59

charging you at

20:00

$99 every year so if you're just one

20:03

person and you're trying to switch to

20:05

the personal plan after this you'll

20:06

need to do that at the end or near the

20:08

end of those 30 days from there like any

20:11

company they're going to ask for some

20:12

payment methods I'm going to just go

20:13

ahead with PayPal PayPal's all set go

20:16

ahead and do more paperwork of adding

20:18

billing address and with that I can

20:20

start trial and pay later so now that

20:22

I'm logged in I want to install the

20:24

desktop app so it gives me access to

20:26

right here it's going to go ahead and

20:27

begin this it's going to ask if you want to

20:29

allow this app to make changes to your

20:30

device yeah I trust them so only took a

20:33

few minutes and all the different

20:34

Microsoft 365 office apps were installed

20:37

so I just come down to the search bar

20:39

down here type in Excel let's pop it

20:42

open make sure it's working and in order

20:44

to get started you need to sign in in

20:46

order to verify that it's your

20:48

subscription so I put in my email and

20:51

password and already forgot my

20:53

password now I'm resetting my password

20:56

and now I'm all set up all right and we

20:58

got to agree to some lawyer talk of

21:00

accepting licensing agreements at this

21:02

point I'm pretty worn out of going

21:03

through this process so I'm just going

21:05

to click through everything I'm not

21:06

going to send any optional data

21:08

personally I don't like to do that I

21:10

don't want to personalize right now and

21:13

it looks like I'm finally done all right

21:15

I'm into it and now that we're into

21:17

Excel we can see up here it should have

21:18

your name or your account that you're

21:20

going into and go in here into the blank

21:22

workbook all right so that basically

21:24

concludes this lesson on installing

21:25

Excel I do want to show real quick how

21:27

easy it is to actually cancel your

21:29

membership should you want to go about

21:31

just getting the free version or the

21:33

free 30-day trial and you want to cancel

21:34

it before any if I go back to my account

21:37

I can go in here to manage

21:39

subscriptions and here I'm inside my

21:41

Microsoft account which tells me I'm

21:43

subscribed to Microsoft 365 family I can

21:45

share it with up to five other people

21:49

and for that I just click on it and I

21:50

can copy a link and provide it to

21:52

whoever I want to share it with we're

21:54

going to cancel it so we can go to

21:55

manage subscriptions right here and all

21:57

we got to do is click cancel

21:59

subscriptions it's going to have me

22:01

confirm that I do want to cancel this

22:03

family plan makes me scroll all the way

22:05

to the bottom after showing me all these

22:06

different prices that I could get

22:08

instead and I'm going to say yeah I

22:10

don't want my subscription and as I'm

22:13

filming this on August 27th it basically

22:16

says hey you still have access to this for

22:18

30 days until September 26th so still

22:21

technically have access to it so if you

22:23

haven't done it already it's your turn

22:24

to now go and install Microsoft Excel

22:26

with one of the options that I've shown

22:28

here in the next chapter we're going to

22:30

get into a spreadsheets intro to get you

22:32

familiar with how to actually use all

22:35

the different functionality of the graphical

22:37

user interface or GUI of Excel with

22:39

that see you in the next

22:44

one welcome to this chapter on an intro

22:47

to spreadsheets and this chapter has

22:50

three different lessons in order to

22:53

understand what we're covering in those

22:54

three different lessons we need to

22:55

explore some vocabulary with it so let's

22:57

jump into Excel for this lesson we're

22:59

going to be focusing on worksheets and

23:02

that is basically as you can see this

23:03

tab here called sheet one that is how to

23:07

manipulate these different cells within

23:10

this worksheet or also known as a sheet

23:13

in the next lesson we're going to be

23:14

going into workbooks so workbooks

23:17

basically captures either one sheet like

23:19

this one sheet one if I add another one

23:20

sheet two so it encapsulates multiple

23:23

different sheets within this program of

23:25

Excel and then finally in the third

23:27

lesson of this chapter we're going to be

23:28

moving into the ribbon which is up here

23:30

at the top and has a bunch of different

23:32

functionality to extend into those

23:34

spreadsheets along with using this file

23:37

tab up here that has a whole bunch of

23:39

features within it as well now this

23:41

chapter was designed for those that may

23:43

not have experience with using Microsoft

23:46

Excel before so if you don't fall in

23:48

that category as in you've used excel in

23:51

your job and you're pretty familiar with

23:53

all those different features I just

23:54

showed you can feel free to skip this

23:56

chapter and then move into the next one

23:59

on functions along with all those

24:00

different practice problems but if

24:02

you're not comfortable with that stick

24:03

around we're going to get into

24:06

it all right so the first thing you need

24:08

to do is open up that first Excel sheet

24:11

in the files you should have downloaded

24:12

from GitHub on 1_Worksheets inside

24:16

of here I have an original sheet that

24:18

allows you to actually go in and fill in

24:20

everything we're going to be doing and

24:21

manipulating during the course of this

24:23

lesson then if you get lost along the

24:25

way or want to peek ahead to see what

24:26

we're actually going to do you can

24:28

actually scroll over here or select the

24:29

final sheet to see that now I want to

24:32

make this as big as possible for you to

24:33

see so I'm going to go ahead and close

24:35

out this ribbon up here and you can just

24:36

do that by double clicking on any one of

24:38

these different items up here and then

24:40

from there I also want to zoom in so I'm

24:41

going to come down here to the bottom

24:43

right and I'm going to just zoom in to

24:45

about 200% and scroll on over now inside

24:48

the spreadsheet it has all these

24:50

different cells and it's organized in a

24:52

manner where it has rows and the rows

24:55

are labeled with numbers 1 2 3 all the

24:58

way down to about a million and then we

25:00

have the columns and the columns are

25:03

alphabetical and they all go all the way

25:05

to where they start duplicating where

25:07

they'll put another letter in front of

25:08

the other and it'll go all the way

25:09

through xfd so let's practice some data
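
As a rough illustration of how those column labels work, here is a minimal Python sketch (not anything Excel itself exposes) that converts a column number into its letter label. The function name `column_letter` is made up for this example; the only facts it relies on are that labels run A to Z, then AA, AB and so on, and that the last column is XFD.

```python
def column_letter(n: int) -> str:
    """Convert a 1-based column number into a spreadsheet-style letter label."""
    if n < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while n > 0:
        # Labels have no zero digit, so shift by one before dividing by 26.
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


for number in (1, 26, 27, 702, 703, 16384):
    print(number, column_letter(number))
```

Column 27 comes out as AA, which is the point where the letters "start duplicating", and column 16384 comes out as XFD.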

25:12

entry here I have a table we're going to

25:13

be filling in for this lesson basically

25:15

has all the different skills associated

25:18

with it and then I want you to actually

25:20

go through while we're going through

25:21

this and you don't have to provide the

25:23

values I do you can if you want we're

25:25

going to be filling it in based on our

25:27

difficulty when we may have started it

25:28

or level and then filling out some other

25:30

cell formulas as we go so we're going to

25:32

start first with Excel and then the

25:35

difficulty so I'm going to select right

25:36

here and I can see which cell is

25:38

selected because it's sort of

25:40

highlighted here on this B and also two

25:42

but also right up here next to this

25:45

formula bar I just call that formula bar

25:48

we can see that we're calling out the

25:50

name of B2 so anytime we reference any

25:54

cells it first references the column

25:56

letter and then the row number so in

25:59

this case I'm selected in C7 so I'm

26:01

going to go ahead and give this a number

26:03

I'm going to say four for myself as you

26:05

notice I just put it right in the box

26:07

alternatively I can also select the cell

26:10

I want to go to and then come up here

26:11

into the formula bar type what I want

26:14

so I want five for Python and go from

26:16

there whenever I press enter it then

26:18

goes down to the next cell so

26:20

technically I could just go through and

26:21

enter this all in using my keyboard and

26:24

I don't have to click or move manipulate

26:26

at all except to select the cell that

26:28

I wanted so those were all numerical

26:30

values when we move into the skill known

26:33

on whether we know it or not we want to

26:35

put in whether it's known or not we want

26:36

to put true or false this is known as a

26:38

Boolean value so typing in something

26:41

like true I can see when I press enter

26:44

it actually updates to be all caps for

26:47

this TRUE so it recognizes the data type

26:49

of this as Boolean now if you're taking

26:51

this course you probably don't know

26:52

Excel so we're going to put in false

26:54

instead now say I want to update the

26:56

rest of these to false I can

26:59

go through and actually type it up or I

27:01

can select this lower right hand corner

27:04

of cell C2 and now I can drag these

27:07

values down and it will autofill it in

27:11

now autofill's not just limited to

27:14

Boolean values let's say I had something

27:16

like Luke I could put that here and just

27:19

drag it down it's going to fill in Luke

27:21

all the way through here a cool feature

27:23

about Excel is say I have something like

27:25

one and then two I could select both of

27:28

these cells and then when I drag it down

27:30

it's going to actually fill in three or

27:32

four now autofill can also throw you off

27:35

especially for dates so let's say we're

27:37

filling in when we're starting Excel

27:39

which we'll put in as today's

27:40

date in my case it's August 27

27:44

2024 I'm going to go ahead and press enter

27:47

to save that in it automatically updates

27:48

to this formatting here in America if in

27:51

Europe you may see the month in a

27:52

different location anyway if I select

27:55

this and actually drag down what you'll

27:57

see is it will do that autofill

28:00

but it's not going to keep that same day

28:02

per se it assumes we want to increment by

28:04

one day now specifically with dates if I
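
The autofill behavior described here can be sketched in a few lines of Python. This is only a simplified model under stated assumptions, not how Excel is implemented: a single seed value is repeated, two or more numeric seeds continue their last step, and a date seed advances one day per cell. The names `extend_series` and `extend_dates` are hypothetical.

```python
from datetime import date, timedelta


def extend_series(seed: list, count: int) -> list:
    """Model a drag-fill: repeat one value, or continue a constant numeric step."""
    if len(seed) == 1:
        # One selected cell (text, Boolean, or number): the value is copied.
        return [seed[0]] * count
    # Two or more numbers: keep going with the step between the last two.
    step = seed[-1] - seed[-2]
    return [seed[-1] + step * i for i in range(1, count + 1)]


def extend_dates(start: date, count: int) -> list:
    """Model a drag-fill on a single date: each new cell is one day later."""
    return [start + timedelta(days=i) for i in range(1, count + 1)]


print(extend_series(["Luke"], 3))
print(extend_series([False], 2))
print(extend_series([1, 2], 2))
print(extend_dates(date(2024, 8, 27), 2))
```

The date case is the one that can throw you off: a single date does not repeat the way a single text value does.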

28:07

want to change the format I can actually

28:09

come up here and I'll expand out this

28:11

home ribbon again and right now it's

28:14

recognizing that the number is of date

28:18

and for date I have a few different

28:19

options I can do short date which is

28:20

shown here or even something like long

28:22

date I can also go even further which

28:25

we'll explore as we get further into

28:27

this course into this more number

28:28

formats and date actually has a whole

28:30

bunch of other different options that we

28:32

can choose from but for right now we're

28:34

just going to keep it this simple date

28:35

format and I'm going to click okay now

28:38

assuming you haven't started any of

28:39

these I'm going to go ahead and actually

28:40

just select all the different cells that

28:42

I want and if you were to press delete

28:45

it's only going to delete that top cell

28:47

and that's sort of annoying because I

28:49

want to delete all these different cells

28:52

instead what I'm going to do if I'm on a

28:53

Windows machine I'm going to press delete or

28:55

in my case I'm using a Mac Windows VM

28:58

I'm pressing function delete and it's going

29:00

to delete all the different content

29:01

right I'm also going to go ahead while

29:03

I'm here delete all that different

29:04

content down there we don't need it now

29:05

we're going to move on to level type of

29:07

data we're going to put into this is

29:08

text so in the case of excel you're

29:10

probably a beginner so I'll put in

29:12

beginner and then if I want to I can go

29:14

through and fill out different levels

29:15

for each of these so Python advanced Power BI

29:18

Advanced and so on for all these now one

29:21

thing to notice real quick is for the

29:23

date it does specify in here under this

29:26

home ribbon that it is a date but all

29:28

these other ones it just characterizes as

29:30

general which is perfectly fine now for

29:33

these other options down here let's go

29:35

ahead and say I wanted to put in

29:36

beginner for all the rest of these can't

29:38

necessarily drag and drop this but what

29:39

I can do is I can actually copy it

29:41

specifically I could right click the

29:42

cell and come up here and copy it but I

29:44

don't recommend that also over here on

29:46

the home menu they have an option as

29:48

well to copy or even cut something so I

29:51

can select something like copy as well

29:53

and it's going to put these marching

29:55

ants as they call it around the cell to

29:57

tell you that hey it's actually selected

30:00

and then if I wanted to paste it I go

30:02

ahead and select down here and I could

30:04

paste it down below that's not what we

30:05

want to do I don't like going through

30:06

and actually selecting all these

30:07

different buttons I want to minimize it

30:09

as much as possible and I want to use

30:11

shortcuts so in order to stop these

30:13

marching ants I can go ahead and press

30:15

escape and I'll select the cell that I

30:17

want to copy and from there I'll press

30:20

Ctrl+C and that copies it and then I

30:23

can go ahead and paste it below by

30:25

selecting the cell that I want and

30:27

pressing Ctrl+V now you'll be

30:29

noticing that when I'm going through

30:31

this I have these shortcuts appearing right

30:33

here next to me on the screen so you'll

30:34

be able to follow along as well as I'm

30:36

using these shortcuts the other option

30:38

is I could cut this so I could press

30:41

Ctrl+X and then paste it in here Ctrl+V but

30:45

this is going to go ahead and take this

30:46

value out of here we don't want to

30:47

necessarily do that so I'll just copy

30:49

this again Ctrl+C and then paste it right

30:51

above here Ctrl+V shortcuts are going

30:54

to be a big timesaver and we're going to

30:56

be using them a lot throughout this

30:58

course in order to save you time and

30:59

having to go back to your mouse in

31:01

order to manipulate it and select the

31:03

different

31:05

cells all right so let's step this up a

31:07

notch and we're now going to get into

31:09

using formulas and formulas are denoted

31:13

by whenever we go into a cell like

31:15

difficulty here which we want it to be

31:17

on a 1 to 10 scale we denote formulas by

31:21

an equal sign and in this case we want

31:24

the difficulty to be on a 10-point scale

31:26

basically transition from that 5 point

31:28

scale so we need to multiply it times

31:29

two so we could do something like 4 * 2

31:34

and I press enter and it's going to give

31:36

me as I expect eight but I actually

31:38

don't recommend hardcoding values that

31:41

are already inside of excel here

31:44

specifically this four so instead of

31:46

this I'm going to remove this and I can

31:50

either type in the cell coordinates of

31:52

the cell so I could type in

31:54

B2 and as you notice it's highlighting

31:57

one the B2 is blue but then the cell B2

32:01

is highlighted in blue alternatively I

32:03

can have an equal sign here and just go

32:05

over and actually select it as well

32:07

whenever I press enter it's going to go

32:09

ahead and say hey it's four now now that

32:12

I'm referencing that four I want to say

32:14

that this is 4 * 2 pressing enter we

32:20

have 8 once again we're going to use

32:22

that power of autofill so I can select

32:24

that cell of F2 and now drag it down and

32:28

what's going to be pretty interesting

32:30

about this is the two as denoted in the

32:33

formula bar and actually whenever I

32:35

click into it as well the two remains

32:37

the same but autofill automatically

32:40

knows to adjust the formula or the cell

32:44

coordinates for the next cell down based

32:47

on how I did that autofill just to show

32:50

this as well I could say hey let's equal

32:52

this to B6 right below it and then if I

32:55

were to drag this over it's going

32:58

to then put in C6 D6 E6 then F6 so
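
To make that adjustment concrete, here is a small Python sketch of relative references shifting during a fill. It is a toy model under assumptions, not Excel's own logic: it only handles plain references such as B2 (no dollar signs, no sheet names), and the helper names `shift_ref` and `fill_down` are invented for this example.

```python
import re


def col_to_num(letters: str) -> int:
    """Turn a column label such as 'B' or 'AA' into a 1-based number."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n: int) -> str:
    """Turn a 1-based column number back into its letter label."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def shift_ref(ref: str, rows: int = 0, cols: int = 0) -> str:
    """Move a simple reference like 'B2' by a number of rows and columns."""
    match = re.fullmatch(r"([A-Z]+)([0-9]+)", ref)
    if match is None:
        raise ValueError(f"not a simple cell reference: {ref!r}")
    col, row = match.groups()
    return num_to_col(col_to_num(col) + cols) + str(int(row) + rows)


def fill_down(formula: str, count: int) -> list[str]:
    """Rewrite every cell reference in the formula for each row below it."""
    filled = []
    for offset in range(1, count + 1):
        filled.append(
            re.sub(
                r"[A-Z]+[0-9]+",
                lambda m: shift_ref(m.group(0), rows=offset),
                formula,
            )
        )
    return filled


print(fill_down("=B2*2", 3))
print([shift_ref("B6", cols=i) for i in range(1, 5)])
```

The constant 2 is left alone because it is not a cell reference; only B2 moves, first to B3, then B4, and so on. Dragging sideways changes the column letter instead of the row number.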

33:02

pretty cool I'm going go ahead and

33:03

delete this now the last column we're

33:05

going to be filling in is skill and

33:07

level we'll also be using a formula for

33:09

this and we'll set this equal to this

33:11

skill thing and also this level so I'll

33:15

start by putting in an equal sign and

33:17

then it's not on the screen right now

33:19

but I know it's in b or sorry A2 and I

33:22

can see that selected by scrolling over

33:25

here now how am I going to get in that

33:28

F2 well I can do an ampersand now and

33:32

from there I'll put in F2 and it has

33:35

this selected as well pressing enter

33:37

ended up in the wrong one sorry about

33:38

that should have been E2 and now I have

33:41

Excel beginner but there's no space in

33:43

between there this is sort of hard to

33:45

read so what I can do is actually

33:47

manipulate this to include another

33:50

ampersand and then in between this I'm going

33:53

to put quotes and this is hey insert

33:55

this text character in between it

33:57

specifically I want to have a space then

33:59

a dash and then another space and then

34:02

press enter now if I tried to do this

34:05

without the quote if I just did this and

34:08

press enter I'm going to get a typo in

34:11

my formula you have to actually put

34:13

those quotes around to show that it's

34:14

text and it's trying to correct it for

34:17

some minus sign I don't really like how

34:19

it's doing it oh my gosh it's freaking

34:21

out now anyway I put the quotes back in

34:23

there pressing enter boom we have it and

34:26

like before I'm going to just do

34:28

autofill to fill all those

34:31

in so let's zoom out a little bit cuz
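
The ampersand formula from this step, `=A2&" - "&E2`, joins text the same way string concatenation does in most languages. A minimal Python sketch of the same idea, assuming A2 holds the skill and E2 holds the level as in the lesson's table:

```python
# Stand-ins for the contents of cells A2 and E2.
skill = "Excel"
level = "Beginner"

# Like =A2&E2: the two values are glued together with nothing in between.
without_separator = skill + level

# Like =A2&" - "&E2: the quoted text is inserted between the two values.
with_separator = skill + " - " + level

print(without_separator)
print(with_separator)
```

The quotes matter in both places: without them the dash is read as a minus sign rather than as text.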

34:34

we're now going to be working with

34:36

ranges which is a collection of cells

34:39

now if you notice whenever I select in

34:42

this case I'm selecting B2 it says B2 up

34:44

the top but if I go to select more of

34:46

this it will actually call out that five

34:49

r or five rows by two c or two columns

34:52

and then when I Let Go it just goes back

34:54

to B2 anyway ranges are a selection of

34:57

multiple cells so if I come over here

35:00

to I1 put in an equal sign and then if I

35:02

want to say copy this entire range I can

35:06

go ahead and select this all so it's

35:09

saying it's A1 colon G6 so start the

35:13

upper left hand corner of A1 and the

35:14

bottom right hand corner of G6 now this

35:17

is pretty cool there's a new feature of

35:19

Excel of dynamic ranges it's going to go

35:21

ahead and fill this in there's only one

35:24

formula in here of that A1 through G6

35:26

but you see that has this Shadow border

35:29

around here that's showing that this

35:31

dynamic range is now filling in for all

35:34

these different things and if we look at

35:35

the formula bar it's sort of grayed out

35:37

here too for it only at the very

35:40

beginning does it show that A1 and G6

35:43

and then you could manipulate it so if I

35:44

wanted to I could change it to G5 and it

35:46

would just go down a row now we're not
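
A range reference is just two corners, so its size can be worked out from the text alone. Here is a short Python sketch, with a hypothetical helper `range_shape`, that returns the row and column counts in the same "rows by columns" form Excel shows while you drag a selection. It assumes plain references without dollar signs or sheet names.

```python
import re


def col_to_num(letters: str) -> int:
    """Turn a column label such as 'G' into a 1-based number."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def range_shape(ref: str) -> tuple[int, int]:
    """Return (rows, columns) covered by a reference such as 'A1:G6'."""
    match = re.fullmatch(r"([A-Z]+)([0-9]+):([A-Z]+)([0-9]+)", ref)
    if match is None:
        raise ValueError(f"not a simple range reference: {ref!r}")
    col1, row1, col2, row2 = match.groups()
    rows = abs(int(row2) - int(row1)) + 1
    cols = abs(col_to_num(col2) - col_to_num(col1)) + 1
    return rows, cols


print(range_shape("A1:G6"))
print(range_shape("A1:G5"))
print(range_shape("B2:C6"))
```

Changing G6 to G5 drops one row and keeps all seven columns, which matches the range shrinking by a row in the lesson.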

35:48

limited to just that we could in fact

35:51

select an entire column so in this case

35:53

I'll put an equal sign and let's say I

35:56

want to do the full column of column

35:59

a right here I can select up here a it's

36:02

going to select all the way down and if

36:05

we go over to the formula bar itself we

36:07

can see that it's saying a colon a that

36:10

means all the contents of column A are

36:12

going to be included in this and from

36:14

there it's putting a copy putting all

36:16

these different things and then when

36:17

there's not a value in it because it's a

36:19

copy similar to over here for these

36:21

dates of zero we're going to see Zero in

36:24

all these different values all the way

36:25

down now similarly I can also do a copy

36:28

of a row so in this case if I wanted to

36:30

or multiple rows if I wanted to do rows

36:33

five and six I could press enter going

36:35

to get an error with this though and that

36:37

has to do with this Q column right here

36:40

that we're copying and pasting here so I'm

36:41

going to go ahead and delete that real

36:43

quick get rid of it and now we have that

36:47

rows five and six duplicated below along

36:50

with that shadow around it and all there

36:53

now these ranges are going to save us a

36:55

lot of time later so I'm going to go

36:56

ahead and delete this right now I don't

36:57

want any of that as later on when we get

36:59

into actually using functions within

37:02

formulas I can use something like the

37:04

average function put in a range in here

37:08

so it selects all of it and then get the

37:10

average of it in this case now one last

37:13

thing to note on this before we wrap up

37:16

here on how to save this is you may have

37:18

noticed that this date started over here

37:20

is a number and that's because that's

37:24

how Excel stores dates within this

37:28

spreadsheet right here so if I actually

37:29

click on it go back up to home right now

37:32

it's storing it under the format of

37:35

General right now so if I were to make

37:38

this into an actual date we can see that

37:40

it is in fact 8/27/2024 now just some fun

37:44

little trivia if I were to put in number

37:46

one and transition it to a date so

37:51

coming up here and selecting date that

37:54

first date starts at January 1st 1900

37:58

and then they move on the numbers from

38:00

there all right last thing we need to do
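
For anyone curious how a date turns into a number, here is a hedged Python sketch of the serial-number idea. It assumes the default 1900 date system on Windows, where serial 1 is January 1st 1900, and it accounts for the well-known quirk that this system counts a February 29th 1900 that never existed (serial 60). The function names are made up for this example.

```python
from datetime import date, timedelta


def serial_to_date(serial: int) -> date:
    """Convert a 1900-system serial number into a calendar date."""
    if serial < 1:
        raise ValueError("serial numbers start at 1")
    if serial == 60:
        raise ValueError("serial 60 is the nonexistent 29 February 1900")
    # Before the phantom leap day the offset is one day smaller.
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return epoch + timedelta(days=serial)


def date_to_serial(day: date) -> int:
    """Convert a calendar date (1 January 1900 or later) into its serial number."""
    if day < date(1900, 1, 1):
        raise ValueError("the 1900 date system starts on 1 January 1900")
    epoch = date(1899, 12, 31) if day < date(1900, 3, 1) else date(1899, 12, 30)
    return (day - epoch).days


print(serial_to_date(1))
print(date_to_serial(date(2024, 8, 27)))
```

So a cell showing a five-digit number under the General format is the same value as the date; only the number format differs.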

38:02

is now save the work that you just

38:04

completed with this you can do this

38:06

multiple different ways we can come up

38:07

here to the top of your Excel workbook

38:11

right here and click save you can also

38:13

as shown you can use Ctrl+S

38:15

alternatively you can come over here to

38:17

the file menu and then come on down to

38:20

save or save as and then if you wanted

38:23

to you can specify the location where

38:26

you actually want to save your file and

38:28

save it there now you do have the option

38:31

which I highly recommend if you're

38:32

working with real world files you want

38:35

to actually save them to use this

38:37

autosave feature the one caveat to this

38:41

is that your files have to be stored on

38:44

OneDrive right now with the plan that I

38:46

have I can store about one terabyte of

38:48

files on there so if you'd like to do

38:50

that feel free to transition your files

38:52

there I'm not going to um and I won't

38:54

have autosave on for this but for very

38:56

important files definitely do have

38:58

autosave set up all right for those that

39:00

have purchased the practice problems and

39:02

notes you have some practice problems to

39:05

go through and get even more familiar

39:06

with manipulating cells inside of a

39:08

spreadsheet after that we're going to be

39:10

going into manipulating a workbook with

39:12

that see you in the next

39:17

one all right we're going to be

39:18

continuing on with this spreadsheets

39:20

intro focusing now on workbooks so

39:24

previously we were focusing on

39:25

worksheets which are a sheet inside of a

39:28

workbook now we're going to be focusing

39:29

on manipulating and moving data between

39:36

workbooks now for this I don't want you

39:38

immediately jumping into that

39:41

2_Workbooks Excel file this really just

39:43

has all the answers in it it doesn't

39:45

have really what we need for it instead

39:47

we're going to be starting with a new

39:49

workbook and instead importing in some

39:52

data so specifically if we go into this

39:54

folder of 0_Resources

39:58

into data sets we have this one Excel

40:01

file called Data job salary monthly now

40:06

this is similar to the data that we're

40:08

going to be using for the remainder of

40:09

the course we're actually going to use

40:10

another Excel sheet but this one here is

40:12

pretty neat because it's broken up by

40:14

months into different sheets so all the

40:16

job postings for January are in this

40:18

sheet called Jan and so on for February

40:22

and so on for March so what we're going

40:24

to be doing in this lesson is moving we

40:26

want to just evaluate the January data

40:28

move that into a new workbook so to get

40:31

a new workbook as easy as possible we're

40:33

going to come over here to the file menu

40:35

I'm just going to go to new and click blank

40:38

workbook now here I have that new

40:40

workbook right now it's titled book two

40:42

because it hasn't been saved anyway

40:44

going back to that file menu just to

40:45

show you I have different options I can

40:48

get a new workbook so we went into new

40:50

and just selected uh a blank workbook

40:53

also we could use this Home tab and

40:56

select a blank workbook based on

40:58

that also have a bunch of different

40:59

tutorials you can check out also we have

41:01

this open tab right here which allows

41:04

you on the left hand side to select a

41:06

location like this PC or even browse

41:09

different locations in your file system

41:11

but frankly I'm using more often than

41:13

not over here on the right hand side

41:16

this right here where this shows a past

41:18

history of Excel files I've worked with

41:19

so I can go through and actually select

41:21

an Excel file pretty easily we're going

41:23

to explore more about this file menu

41:25

more in a bit let's get moving some data

41:28

first now before we get into copying

41:30

this data into the new workbook itself I

41:34

want to actually just copy it within its

41:38

own workbook so if we notice some controls

41:40

down here at the bottom we have all the

41:42

different Sheets if we want to add

41:43

another sheet which I want to copy it to

41:45

I'm just going to add this in right here

41:47

and I'm going to call this Jan copy

41:50

press enter and that's the new sheet

41:52

and I added that by just double clicking

41:54

in there and then allowing it to edit in

41:56

addition I can right-click it and I can

41:59

do things like rename it and that will

42:01

do the same thing now there's also some

42:03

controls around here you notice there's

42:05

some arrows on right here and what that

42:08

does is just scrolls all the way over or

42:11

incrementally over so I can see all the

42:13

different sheets in this case there's

42:14

more sheets than I'd actually see in one

42:17

view then we have the scroll bar over on

42:19

the right hand side this is actually

42:20

just controlling the scroll area within

42:22

our new sheet of Jan copy so previously

42:26

we saw how we can copy ranges using a

42:30

formula in this case I'm entering equal

42:31

to and then I'm just going to select

42:33

this range right here press enter and I

42:37

can get it inserted in and then actually

42:39

looking at the formula it's just equal

42:40

to

42:41

J1 colon P8 and this has its range

42:45

right there all right so I want to get

42:47

the contents into this sheet so I'm

42:49

going to start by putting an equal sign

42:51

and then I go over to that Jan sheet

42:54

and when I go over here you're going to

42:55

notice that now next to that equal sign

42:58

I have Jan the name of the sheet and an

43:00

exclamation point this is identifying

43:02

the sheet and I want all these different

43:04

items so as I go to select it all you

43:08

can see that it's updating in the

43:09

formula bar right now I have A1 through

43:11

P2 selected but I actually want to

43:13

select everything in this sheet and

43:16

we're about at 3,000 rows and right now

43:20

I'm only at about 500 of those this is

43:22

going to take forever so I don't

43:23

recommend necessarily doing this type of

43:26

method to try to select all your data so

43:28

I'm going to go ahead and Escape out of

43:30

this and go back to where we were at the

43:33

Jan copy instead once again I'm going to

43:35

press that equal sign go back to that

43:37

Jan sheet right up in the formula bar once

43:40

again I can see that it has the Jan and

43:41

the exclamation point I'm going to

43:43

select A1 to start with and I'm going to

43:46

press the shortcut Ctrl+Shift and then

43:49

the right arrow key and now all the top

43:52

row is selected from here I'm going to

43:55

continue to hold control shift and press

43:57

control shift down and it's going to

44:00

select all the different rows so as we

44:03

can see up here A1 to P3103 scrolling

44:06

down we don't have any more data now all

44:09

I have to do is press enter and I did

44:11

this to basically show the nomenclature

44:14

now so now we're not only selecting a

44:16

range but we're also selecting a range

44:18

from a different sheet and this is how

44:21

Excel does the nomenclature or the formula

44:23

necessary to make this work and once

44:25

again this is a dynamic range appearing

44:28

inside of here but we really want to put

44:31

it inside of here into this new workbook

44:35

so what I'm going to do is I'm going to

44:37

actually delete this sheet right here

44:39

because we don't need this copy sheet in

44:40

here I don't want to actually manipulate

44:42

my data at all going to right click it

44:43

and select delete it's going to prompt

44:46

me any time that hey you're going to

44:48

permanently delete a sheet do you want

44:49

to continue yeah I want to continue now

44:52

once again I'm going to go back to that

44:55

original blank sheet that we have I want

44:57

to put it into here so I'm actually

44:58

going to name this one Jan and then

45:02

we'll call this one formula CU

45:03

technically it was a formula not a copy

45:04

I don't know why I did copy before

45:06

anyway back into A1 once again I'll

45:08

press that equal sign and then going

45:10

back to that other workbook I will

45:13

select it the first cell in there which

45:16

is actually A1 and now we can see we

45:19

have in the formula bar which is

45:21

actually the front of the bar which is

45:23

sort of strange in the other sheet that

45:25

or other workbook that we work with we

45:27

have inside of brackets the Excel file

45:30

name the sheet that we're in and then

45:33

the actual uh cell range of A1 we have

45:37

dollar signs around this this locks the

45:39

references of it which we're going to go

45:41

into more detail on but the main thing

45:42

to understand is this has A1 selected

45:44

right now but we want to select all this

45:45

data so that shortcut of control shift

45:48

right select all the different columns

45:50

and then control shift down okay it's

45:53

all selected I'm going to go ahead and

45:54

press enter and it's going to take me

45:55

back to my original workbook that I was

45:58

trying to work with this now that was
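
The reference nomenclature from this step follows a simple pattern: an optional workbook name in square brackets, then the sheet name, an exclamation point, and the cell or range. Below is a small Python sketch that builds those strings. The helper name `sheet_reference` and the file name `salaries.xlsx` are placeholders for illustration, and the quoting rule is simplified to cover only names that contain spaces.

```python
def sheet_reference(cell_range, sheet, workbook=None):
    """Build a reference such as Jan!A1:P3103 or [book.xlsx]Jan!$A$1."""
    prefix = f"[{workbook}]{sheet}" if workbook else sheet
    if " " in prefix:
        # Names with spaces are wrapped in single quotes (simplified rule).
        prefix = "'" + prefix.replace("'", "''") + "'"
    return f"{prefix}!{cell_range}"


# Same workbook, different sheet.
print(sheet_reference("A1:P3103", "Jan"))

# Different workbook: the file name goes in brackets before the sheet name.
print(sheet_reference("$A$1", "Jan", workbook="salaries.xlsx"))

# A sheet name containing a space needs quotes around the prefix.
print(sheet_reference("A1", "Jan copy"))
```

The dollar signs in the second example are the locked (absolute) form of A1 that Excel inserts when you click into another workbook.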

46:00

using formulas to copy this data we're

46:03

going to explore two more options the

46:05

second one is going to be somewhat

46:07

familiar using copy and paste so I'm

46:09

going to create this new sheet I'm going

46:10

to call it Jan copy and

46:13

paste from here I'm going to go back to

46:15

our original data that we have and since

46:18

we're at the bottom of the sheet I'm

46:19

just going to select the bottom right

46:21

hand corner press control shift left now

46:24

if you noticed it went and stopped

46:26

at this blank cell right here

46:29

which isn't a big deal I'll press it one

46:31

more time it'll go to the next cell over

46:33

that actually has a value in it and then

46:35

once again it's going to go all the way

46:38

to the end of

46:40

A3103 so basically if there's any blanks

46:42

while you're trying to do this it's

46:43

going to stop at those values there okay

46:46

and then from there I'm going to press

46:47

control shift up and as we're saying

46:50

it's going to stop at every different

46:51

blank cell along the way this is going

46:53

to take forever unfortunately I don't

46:55

recommend you actually do that ever

46:56

again

46:57

instead start up at the top left and do

46:59

the control shift over to the right and

47:02

then all the way down in order to select

47:04

all the cells now like we did before we
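
Why the selection keeps stopping can be shown with a toy model in Python. This is a simplified sketch of the jump behavior as described in this lesson, not Excel's actual implementation: from a filled cell the jump runs to the last filled cell before a blank, and from the edge of a block it skips the blanks to the next filled cell. The function name `jump_index` is hypothetical, and a Python list stands in for one row or column.

```python
def jump_index(cells, start):
    """Return the index where a Ctrl+Shift+Arrow style jump would stop."""
    last = len(cells) - 1
    if start >= last:
        return last

    def filled(i):
        return cells[i] not in (None, "")

    if filled(start) and filled(start + 1):
        # Inside a block of data: run to the last filled cell before a blank.
        i = start
        while i < last and filled(i + 1):
            i += 1
        return i

    # At the edge of a block (or on a blank): skip ahead to the next filled cell.
    i = start + 1
    while i < last and not filled(i):
        i += 1
    return i if filled(i) else last


with_gap = ["a", "b", "c", None, "d", "e"]
no_gaps = ["id", 1, 2, 3, 4, 5]

print(jump_index(with_gap, 0))
print(jump_index(with_gap, 2))
print(jump_index(no_gaps, 0))
```

With a gap in the data, one jump is not enough to reach the end, while the header row and first column usually have no gaps, which is why starting at the top left tends to select everything in two key presses.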

47:06

want to copy it I could either use this

47:07

up at the top in the home ribbon right

47:10

here I could actually select copy or the

47:13

shortcut which I'm going to recommend of

47:15

Ctrl+C and from there going back into

47:17

our new workbook selecting cell A1 and

47:21

then using Ctrl+V and pasting all this

47:24

data in now moving on to the third

47:25

example which is is actually the one I

47:28

recommend you do anytime you need to

47:29

move sheets of data basically in both of

47:32

those previous approaches you could go

47:34

about missing some data when moving it over

47:37

so I don't really recommend doing that

47:39

instead I would come down here to the

47:41

Jan sheet right-click it and select move

47:45

or copy so we have this new window that

47:47

pops up and it has To book right now it

47:50

has this Excel sheet selected of data

47:52

job salary monthly we don't want to move

47:54

to that we want to move to book two we

47:57

could also move to a new book but book two is

47:59

open that's what we've been working in

48:00

that's what we're going move to okay we

48:02

can see we have the different sheets

48:04

that we've already made in there and it

48:06

says in this dialogue this is where you

48:08

want to put this before this sheet and

48:11

we want at the end so we'll select move

48:13

to end now we don't want to take this

48:17

sheet Jan out of here we just want a

48:19

copy of it so we're going to select this

48:21

create a copy and then click okay now Jan

48:24

has moved over here but I do want to

48:26

actually differentiate this so I'm going

48:28

to

48:29

put move or

48:33

copy now in the next lesson we're going

48:35

to be exploring more about the ribbon

48:36

but we're going to be exploring now more

48:38

about the file menu or also known as

48:42

backstage view we've gone through this

48:44

home new and open we also have this here

48:47

for share this is available for well if

48:50

you're sharing it via one drive this

48:52

makes it super easy to share with your

48:54

co-workers we're not going to go into a

48:56

lot of detail but this is a great option

48:58

if you're working in one drive and you

48:59

want to actually collaborate with other

49:01

co-workers you can work on Excel files

49:03

at the same time moving down to the list

49:05

here we also have get add-ins and we're

49:08

going to be actually looking at

49:10

different addins we can use in the

49:13

advanced chapters whenever we get to

49:15

that so we'll be working with some add-ins with

49:17

that next up is info which has over here

49:19

on the right hand side some key metadata

49:22

about our Excel file itself then if we

49:25

want to get into actually protecting our

49:27

workbook which we're going to cover in a

49:29

few chapters down the road you can get

49:31

into actually doing that the only other

49:33

thing that I find myself doing from time

49:34

to time in this section is on version

49:36

history once again this requires you to

49:37

be using one drive for it but you could

49:40

go back and revert back into a previous

49:42

version that you work with so it's great

49:44

for that now moving into save or even

49:48

save as since we haven't saved yet

49:50

they're both the same right here I'm

49:51

going to go ahead and save this but I

49:53

don't want to save this on one drive

49:54

personal I'm just going to save this on

49:56

my desktop so I'll come and select

49:58

desktop and then I'll name this two

50:01

workbooks and save it now Beyond save as

50:05

we also have things like print which I

50:07

really don't find myself doing that too

50:09

often should be sending an electronic

50:11

version export if I wanted a pdf version

50:14

of something and then finally close as

50:16

well same thing as this x up here just a

50:18

x out of it and there's two more areas

50:20

down here that I want to call out and

50:22

that's account and that allows you to

50:24

actually see behind the scenes of what's

50:26

going on with your Microsoft account and

50:29

this is generic to all the different

50:31

Microsoft products that you have so not

50:34

just Microsoft Excel as you can see from

50:36

my information I'm actually inside the

50:38

Microsoft 365 Insider program so I get a

50:41

lot of access to Insider features get to

50:45

experiment with new stuff before any

50:47

other people do anyway this is where you

50:48

want to come anytime you want to make

50:50

sure that you have your Microsoft

50:52

products up to date I have automatic

50:55

updates available so even if I check to

50:57

update now it's going to tell me hey I'm

50:59

up to date the other thing to note on

51:00

this is the different office themes that

51:02

you have on this I'm actually going to

51:03

change this right now to use system

51:06

settings which on my Mac I use dark

51:09

theme so it's going to go to that last

51:11

two options are hiding down here behind

51:12

more I have feedback so if I wanted to

51:15

give feedback to this product I'd

51:17

probably go to something like X or

51:18

Twitter instead and then finally options

51:21

we'll be getting to options later on in

51:23

this but this allows much more

51:27

advanced features that we can actually

51:28

go in and customize using this menu

51:31

especially whenever we get into add-ins

51:33

we're going to be doing that from here

51:35

all right so now you've become an expert at

51:37

how to manipulate different spreadsheets

51:38

or sheets along with manipulating them

51:42

between different workbooks in the next

51:45

lesson we're going to be going into this

51:47

ribbon up here and actually exploring

51:50

everything a little bit further and

51:51

getting a sneak peek into each one of

51:53

these for those that purchased the

51:55

practice problems and course notes you

51:57

have some practice problems to go

51:58

through now and experiment working with

52:01

different workbooks with that see you in

52:03

the next one where we get into the

52:04

ribbon see you

52:08

there all right this final lesson of the

52:11

spreadsheets intro we're going to be

52:13

getting into the ribbon inside of Excel

52:17

and better understanding what are all

52:19

the different tabs and what are the

52:20

capabilities by doing some simple

52:22

exercises for this we're going to

52:24

continue to be analyzing that January

52:26

data set that we worked from the last

52:27

lesson and we're going to actually get into

52:29

actually performing some data analysis

52:31

with it so for this lesson you can open

52:34

and use that ribbon menu Excel file

52:37

which I have right here and all the data

52:40

that we're going to be working with or

52:41

that January data is in this data tab

52:43

along with all the examples and all the

52:45

different tabs but I don't need this I'm

52:47

not going to work with this so I'm going

52:48

to close this out instead I'm going to

52:50

be working off where we left from last

52:52

time in that two workbooks where we

52:55

actually moved over that January data

52:57

set now quick disclaimer for any of

52:59

these files that you're opening up if

53:01

you're noticing the security warning of

53:03

automatic updates of links have been

53:05

disabled you can go ahead and just enable

53:08

the content and then click right here on

53:10

do not ask me again for network files

53:13

and select yes cuz I want to make it a

53:15

trusted document now if you're getting

53:17

any of these errors that the file has

53:19

been moved renamed or deleted cuz mainly

53:21

you have it in a different location of

53:23

what I had it here's actually the

53:25

address of the file that I'm using I

53:28

open it up anyway this is the actual

53:30

address of where the file is anyway you

53:32

can come down here and select these

53:33

three dots on the file in question and

53:35

just select change Source go into browse

53:40

and then from there inside the actual

53:42

file itself select where this is so in

53:45

this case it's looking for that data set

53:46

file with the data job salary monthly

53:48

I'm going to select it select okay and

53:50

then it's prompting me now that this

53:52

linked workbook hasn't been refreshed I want

53:54

to go ahead and refresh it and it's

53:55

going to update it all right close out

53:57

of this now anyway that was all silly

53:59

because I'm going to go ahead and delete

54:02

this formula one right here and also

54:04

this copy and paste tab right here we

54:07

only want to keep the move or copy which

54:09

is the actual sheet that we moved over

54:12

that has all the data for this lesson

54:15

okay I'm just going to rename the sheet

54:19

data so let's dive into this Home tab

54:21

and this thing has a lot to do with

54:24

formatting the text and how things

54:27

appear within the spreadsheet for

54:29

example I can select all these top rows

54:32

right here so basically A1 all the way

54:34

to P1 I can change this font size to

54:37

something like 12 for the fill color or

54:39

the background color I can change it to

54:41

something like a light gray right now it

54:43

looks like it's already bold I could

54:45

turn it off or turn it back on

54:46

inspecting all these different columns I

54:48

can see that some of it is hidden

54:50

especially here this date column I can

54:52

see inside of here this is the actual

54:54

value but whenever we actually look

54:56

at it from afar it has these

54:58

pound signs (#) so double clicking on

55:01

the edge of that H column right here it

55:04

actually expands out and moves it where

55:05

it needs to go you can actually do this

55:07

for all the columns by just selecting all

55:09

of them and then double clicking that

55:12

last one and then that expands it all

55:14

the way we can see that that last column

55:16

is well super long so it has all the

55:17

different skills typically these titles

55:19

up the top I'd like to maintain centered

55:22

so that way I know that it's a title but

55:23

I could move it to either side also so I

55:26

can move it up or down if I wanted to

55:28

but we'll leave it right there in the

55:29

center as well getting into the number

55:31

formatting itself I can actually go and

55:34

select something like job posted date

55:35

it's going to select that whole column

55:37

if I wanted to I can turn this into a

55:39

date so in our case I want to do a short

55:42

date now other columns I would want to

55:44

format are these salary year average and

55:48

also salary hour average so besides just

55:52

clicking here I can also just select

55:54

that hey I want to use this as an

55:56

accounting number format and it's going

55:58

to automatically put these decimal

55:59

places at the end two decimal places

56:01

since we're in the 100 thousands I don't

56:03

really care about so I'm actually going

56:04

to remove them by saying decrease

56:06

decimal I'm going to do that twice now

56:08

for something like salary hour average

56:10

I'm going to also convert this to a

56:12

currency but these may have

56:14

two decimal places of values included in

56:17

it so I'm going to leave it now so for

56:19

the Styles and cells portion we're going

56:21

to be getting into this more especially

56:23

into conditional formatting in the

56:25

spreadsheets Advanced chapter in

56:27

chapter 4 so we'll save that for then
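The number formats applied a moment ago (short date, accounting with the decimals removed, currency with two decimals kept) can be approximated outside Excel. This is only a rough sketch: the `accounting` helper and the sample values are made up for illustration, and Excel's real Accounting format also aligns the currency symbol, which plain string formatting does not do.

```python
from datetime import date


def accounting(value: float, decimals: int = 0) -> str:
    """Approximate an Excel-style dollar format with thousands separators."""
    return f"${value:,.{decimals}f}"


# Hypothetical values standing in for one row of the data sheet
salary_year_avg = 112500.0
salary_hour_avg = 47.5
job_posted = date(2024, 1, 15)

# Yearly salary: two decimals removed, like pressing Decrease Decimal twice
yearly_text = accounting(salary_year_avg)
# Hourly salary: keep the two decimal places
hourly_text = accounting(salary_hour_avg, 2)
# Short date in month/day/year form (the US regional default; an assumption)
short_date = f"{job_posted.month}/{job_posted.day}/{job_posted.year}"

print(yearly_text, hourly_text, short_date)
```

The point is the same as in the sheet: the stored value does not change, only how it is displayed.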

56:29

the next thing I want to do is get into

56:31

this editing and this is a pretty

56:33

powerful feature we can actually sort

56:36

and filter our data if we wanted to so

56:39

what I'm going to do is actually select

56:41

all these cells from P1 all the way to

56:44

A1 and then come in here inside of

56:47

editing select sort and filter and apply

56:50

this filter so let's actually get into

56:52

filtering this data specifically I'm

56:54

wanting to investigate

56:57

jobs or data analyst jobs in the United

56:59

States and specifically full-time jobs

57:02

we're going to be looking at the salary

57:03

data for this so I want to filter it

57:05

down for it so I'm going to select here

57:08

I'm going to unclick select all and

57:10

select data analyst and now it's going

57:12

to filter for all the different data

57:14

analyst roles there's nothing else that's

57:15

not there additionally that job schedule

57:18

type I want to be looking at full-time

57:20

roles only I don't want to include any

57:22

other ones so I'll select fulltime I

57:24

want the country I don't want to be

57:26

skewed by any other countries I live in

57:28

the United States so I'm going to then

57:30

select United States and then finally I

57:33

only want to look at the salary or the

57:36

yearly salary data so I can actually

57:39

come over here to the salary rate and

57:40

select here I only want to look at the

57:43

year data okay so now this has

57:45

everything in it that I want we're going

57:46

to get to analyzing and visualizing this

57:48

in a second before that I want to talk

57:50

about two other features add-ins which we

57:52

talked about before on how you access them through

57:53

the file menu you can get to add-ins via

57:56

this and finally analyze data which in

58:01

my opinion isn't that strong of a

58:03

feature this tab uses a little bit of

58:06

artificial intelligence behind the

58:08

scenes for you to investigate so it'll

58:10

actually provide you different

58:11

visualizations that you could actually

58:13

visualize out of your data or even

58:16

you can go as far as asking a question

58:19

about maybe you want to see hey the

58:20

distribution of salary rate or something

58:22

like that all you have to do is come

58:24

down here and then insert in the chart

58:26

that you want to insert in I'm going to

58:28

close out of this now we can see that

58:30

we've made this salary distribution um

58:33

that we maybe want to visualize overall

58:35

though I find that this Analyze Data is

58:37

pretty hit or miss so I'm not using it

58:40

very

58:43

often now the insert tab is where I

58:45

spend the second most of my time after

58:47

the Home tab they conveniently put them in

58:48

the correct order there's three major

58:50

use cases that I'm using out of this in

58:53

chapter 4 on the advanced use of

58:56

spreadsheets we're going to be going into

58:57

tables and then in chapter five we're

59:00

going to be going into pivot tables but

59:03

even closer to that in chapter 3 we're

59:05

going to be going all into depth on how

59:07

to use these charts but let's get a

59:09

sneak peek into this specifically

59:11

remember we filtered this table down to

59:13

data analyst jobs in the United States

59:16

and specifically full-time roles we want

59:18

to visualize this salary year average

59:22

column so with column M selected I come

59:25

up here to recommended charts and it's

59:27

going to give me a visualization of some

59:29

well recommended charts now there's only

59:30

four here I can also select this other

59:32

tab up here on all charts and actually

59:35

try to see hey what would this look like

59:37

maybe in a pie chart or a bar chart

59:40

anyway I want this in a histogram which

59:41

we're going to go into more detail on

59:42

how to read this later all I have to

59:44

do is just come in here double click it

59:46

it'll insert it in now notice how

59:49

whenever this was created we now have

59:52

new tabs appear inside of here

59:55

specifically with this selected we have

59:56

this chart design and format tab if I

59:58

select off of it those tabs disappear

60:01

and select it again they reappear this

60:04

tab allows me to dive in and actually

60:06

further customize these visualizations

60:08

to how I want them to appear I can even

60:11

move them to let's say a new sheet and I

60:14

can title this something like histogram

60:16

and then move it the charts don't always

60:19

necessarily appear just like that let's

60:21

actually do a deeper analysis to see

60:23

what are the different job title short

60:25

columns available I want to clear all

60:27

these different filters on here so I'm

60:29

going to come back up here with this one

60:31

row selected come into editing sort and

60:33

filter and I'm going to say hey clear

60:35

all the different filters now selecting

60:38

column A going into insert and into

60:40

recommended charts it's recommending this

60:43

clustered bar chart which is actually

60:45

what I want to view so double clicking

60:47

on this this provides me a breakdown of

60:50

all the different counts of the

60:52

different job titles within our data

60:56

set and we can see things like data

60:58

scientist engineer and analyst are some

61:00

of the highest amount of job postings in

61:03

this data set now unlike our histogram

61:05

example this actually provides this data

61:08

in a pivot table which we're going to be

61:10

going into in the pivot table chapter

61:12

which allows me to further manipulate

61:14

the data so say I want to actually sort

61:16

this I could right-click the values right

61:18

here and click hey sort smallest to

61:20

largest and then closing out this pivot

61:23

table tab right here I can actually see

61:26

what is the highest amount of job

61:28

compared to the lowest which is cloud

61:32

engineer now there's remaining tabs

61:34

we're going to be going through hopefully

61:35

rapid fire in order to cover these as I

61:37

find I'm using these less frequently

61:39

than these other tabs that we previously

61:41

talked about the draw tab allows you to

61:43

well draw on your spreadsheet so I can

61:46

just write on it if I wanted to but I

61:48

don't really find myself doing that

61:49

except for maybe when I'm building

61:50

dashboards besides that use case is

61:52

pretty rare if I want to undo this

61:54

drawing right here I can come up here

61:56

and click undo or I can press Ctrl+Z

61:59

and it'll remove it page layout tab is

62:02

great if you're having to print out any

62:04

data for those co-workers that are

62:05

living in the past and don't know how to

62:07

accept things digitally you can do

62:09

everything from adjusting your page

62:10

layout to adjusting the scale that

62:12

you're actually viewing things now

62:14

personally I find myself more using

62:16

these sheet options right here so if I

62:18

go to this job count tab right here if I

62:20

wanted to I could turn off the grid

62:24

lines on here as you can see it got

62:26

white on the background I really like

62:28

that now if I wanted to make sure they

62:30

had actual grid lines around my table I

62:32

come back to the Home tab and for here I

62:35

can select borders and from there I want

62:37

to put all borders on there so now I

62:39

look like I have this table right here

62:41

along with my graph super fancy next up

62:43

is formulas this is where you need to go

62:45

if you can't remember a function that

62:47

maybe you want to use if it's a text

62:49

function you come in here select

62:50

something like text you can scroll

62:52

through and actually see even a

62:54

description of the different

62:56

functions that are available so in this

62:58

case replace it tells you hey replace

63:00

this part of a text string with a

63:01

different text string depending on what

63:03

version of Excel you have in the newer

63:04

ones you'll have this insert python to

63:07

insert python functions and then finally

63:09

they have more advanced features with

63:12

maintaining and updating and formatting

63:14

your different formulas and functions

63:16

which we'll be diving into in the next

63:17

chapter now besides the home and insert

63:20

tab the data tab is the next tab that I

63:24

find myself using all the time in

63:26

chapter 7 we'll be diving into Power

63:29

query and we're going to be focusing

63:30

heavily on this Get & Transform Data

63:32

and also queries and connections and

63:35

then in chapter 8 when we get to power

63:37

pivot we're going to be going into

63:39

managing our data model with power pivot

63:42

in chapter 4 we're going to be going

63:43

into this forecasting and we're also

63:45

going to be adding in some extra add-ins

63:47

that are going to appear in this data

63:48

tab now I sort of skipped over the data

63:50

types and sort and filter because we've

63:51

seen them on the Home tab they're just

63:54

conveniently located here in bigger

63:56

format for you to use also all right this

63:58

tab on review is probably the least

64:00

likely for me to actually use I can

64:02

actually go through and check things

64:03

like spelling and add comments or even

64:05

protect my sheet besides that I'm not

64:08

finding I'm using this that often view

64:10

tab is similar to the review Tab and

64:12

that I'm using it a little bit more you

64:14

can change the format of how you

64:15

actually want to view things but mainly

64:18

I'm finding myself using this the most

64:20

for freeze panes let's say you see I'm

64:22

scrolling down here and I don't know

64:24

what the job or what the headers are

64:26

right here so going over to this data

64:28

tab I can actually come in here to

64:29

freeze panes and select freeze top row

64:32

or even freeze First Column so in this

64:35

case that top row actually stays up

64:36

there and I really like it like that now

64:38

let's say I want to freeze both the top

64:40

row and that First Column there's not

64:41

really a selection for that so here's

64:43

what you can do you can come over here

64:44

to freeze panes and select unfreeze

64:46

panes and then select something like a

64:48

cell like B2 that means I want

64:50

everything above this and to the left of

64:52

it to freeze so now when I select freeze

64:54

panes this upper or top row is actually

64:57

Frozen and then the actual First Column

65:00

is Frozen as well all right final tab is

65:02

help and I'll be honest I think this is

65:05

pretty useless if I get stuck with

65:06

anything along the way I'm finding

65:08

myself navigating to something like chat

65:10

GPT and it's helping me a lot quicker

65:13

than trying to navigate through this

65:14

help box that it provides and I'm

65:16

already getting an error message with

65:18

even accessing it so you can see how

65:19

often I even use it
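Looking back over the ribbon walkthrough, the two data steps in it (the Sort & Filter selection for full-time US data analyst roles paid by the year, and the job title count behind the clustered bar chart) can be sketched in plain Python. Everything here is an assumption for illustration: the column names are guessed from the headers mentioned in the lesson, the rows are invented, and `filter_rows` is a hypothetical helper, not anything Excel exposes.

```python
from collections import Counter

# Assumed column names and invented rows standing in for the job postings sheet
COLUMNS = ("job_title_short", "job_schedule_type", "job_country",
           "salary_rate", "salary_year_avg")
RAW = [
    ("Data Analyst", "Full-time", "United States", "year", 90000),
    ("Data Analyst", "Full-time", "United States", "hour", None),
    ("Data Analyst", "Part-time", "United States", "year", 60000),
    ("Data Analyst", "Full-time", "United States", "year", 100000),
    ("Data Scientist", "Full-time", "United States", "year", 130000),
    ("Data Scientist", "Full-time", "Germany", "year", 95000),
    ("Data Scientist", "Full-time", "United States", "year", 140000),
    ("Data Engineer", "Full-time", "United States", "year", 125000),
    ("Data Engineer", "Full-time", "Canada", "year", 105000),
    ("Cloud Engineer", "Full-time", "United States", "year", 120000),
]
postings = [dict(zip(COLUMNS, values)) for values in RAW]


def filter_rows(rows, **criteria):
    """Keep rows where every named column equals the requested value."""
    return [row for row in rows
            if all(row[col] == val for col, val in criteria.items())]


# Same four filter choices made through the column drop-downs
us_analysts = filter_rows(
    postings,
    job_title_short="Data Analyst",
    job_schedule_type="Full-time",
    job_country="United States",
    salary_rate="year",
)

# Count of postings per job title (filters cleared), sorted smallest to largest
title_counts = Counter(row["job_title_short"] for row in postings)
sorted_counts = sorted(title_counts.items(), key=lambda item: item[1])

print([row["salary_year_avg"] for row in us_analysts])
print(sorted_counts)
```

Each filter narrows the rows that survive, and the count per title is what the pivot table behind the bar chart is summarizing.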

65:22

then now we've been doing a lot of

65:24

manual clicking with using the ribbon

65:27

and I think a good resource that goes

65:29

with this is shortcuts so if you come

65:31

inside of the resources folder we have a

65:34

Excel file here called Excel shortcuts

65:37

and what this has in it is a list of all

65:39

the different shortcuts that I find

65:41

myself using anytime I'm inside of excel

65:44

so it's worth having all of these I'm

65:46

not going to lie committed to memory it

65:48

looks like a long list but I'm telling

65:49

you by the end of this you're going to

65:51

have all of these basically committed to

65:52

memory they're going to be a timesaver now

65:55

although I poked fun at people that print out

65:57

stuff this would be something that I do

65:59

recommend actually printing out and

66:01

having next to you so that way you can

66:02

reference really quickly while going

66:04

through this course all right now I know

66:06

we move fast through that but we're

66:08

really going to be diving into as I

66:09

called out during this lesson all of

66:12

these different tabs even more as we

66:14

advance through all the different

66:15

chapters that was more of a sneak peek

66:16

into what you're going to be exposed to

66:19

coming up in this course all right for

66:21

those that purchased the practice problems

66:22

you have some problems to go through and

66:24

actually experiment more with the

66:26

tabs in the next chapter we're going to

66:27

be jumping into functions and also more

66:30

specifically formulas in order to build

66:32

them out and perform data analysis on that

66:34

data science job posting data set with

66:36

that I'll see you in the next

66:40

one all right welcome to this chapter on

66:43

formulas and functions in this lesson

66:46

we're going to be focusing specifically

66:48

on going a deep dive and understanding

66:51

formulas then in all the follow on

66:54

lessons after this we're going to spend the

66:55

majority of our time working on

66:57

functions for that we'll be exploring

67:00

the entire function Library focusing on

67:02

the key functions within this library

67:05

that I find that I'm using time and time

67:08

again in data analytics so what are we

67:10

going to be doing in this lesson well

67:11

we're going to be focusing on a

67:13

fictitious data set we're going to keep

67:15

it small in order for us to get more

67:17

familiar with operating with formulas

67:19

and operating on this data set

67:21

specifically by the end of this we're

67:23

going to be able to input into this

67:26

worksheet a number of years of

67:27

experience or total salary and be able

67:30

to see whether these jobs meet those

67:33

conditions specifically that I meet

67:35

both of those conditions so for this you

67:37

can follow along by opening that

67:39

formulas intro workbook in this workbook

67:41

we'll be staying in this data sheet right

67:43

here all the different answers when we

67:45

get to the math operators comparison

67:47

operators or cell referencing are shown

67:49

via that sheet but we'll just be

67:51

sticking with data for

67:54

now first is math operators and as shown

67:57

by this table here you can use a variety

68:00

of different symbols to conduct

68:02

different multiplication subtraction

68:04

division operations that you want to do

68:06

so let's dive into testing some of these

68:07

out we're going to be filling in each of

68:09

these columns that correlate with the

68:11

associated job title as we go through

68:13

this so the first one's going to be

68:15

experience pretty simple right we talked

68:16

about before in order to reference

68:18

another cell we would use an equal sign

68:21

and then from there we can either type

68:22

or select a cell I'm going to recommend

68:25

just typing it to make it go faster C3

68:27

it's highlighted blue because that's the

68:29

cell that's highlighted then we'll be

68:30

using the autofill feature of this to

68:33

fill in all the cells below and we

68:35

notice that it updates to here this

68:37

one's equal to C12 which correlates to

68:39

this one right to the left of it so

68:41

let's calculate our total salary and

68:43

this is going to be taking our annual

68:44

salary in column D and adding it to our

68:47

bonus Max in column e so we can do this

68:50

by specifying

68:52

D3 plus E3 and from there pressing

68:56

enter once again to autofill it I select

68:58

that cell that I want and drag it on

69:00

down now if I want to calculate what is

69:02

the rate of bonus or the bonus rate that

69:05

is going to be the bonus divided by that

69:09

salary so in this case E3 / D3 once

69:15

again going to use autofill drag and

69:16

drop it all the way down now for all

69:18

these values I don't like what it's

69:19

formatted as right now I'm actually

69:21

going to change this to a percentage and

69:23

I want to see one decimal place so I'll

69:26

press this one to expand out one now

69:28

anytime I do any type of mathematical

69:30

operation in Excel I always want to try

69:32

to confirm it that it's correct I did

69:35

the operation correctly so in the case

69:37

of this bonus rate I can do this by

69:40

confirming what we got for total salary

69:43

previously so if we took that bonus rate

69:45

which we want to confirm right so

69:47

we're going to take that and multiply it

69:50

times our annual salary right so that

69:53

should give us that bonus right

69:55

there then if we wanted to like we said

69:57

we want to confirm total salary right

69:58

here so I can just add in that we want

70:02

to also add in that annual salary itself

70:06

and we do have that total salary right

70:08

here to actually confirm what's going on

70:10

dragging it down and doing an autofill

70:12

all these values look like they

70:14

correlate to what it should be for total

70:16

salary so I feel we calculated the bonus

70:18

rate correctly now going back into the

70:20

formula itself you can see we have

70:22

multiple operations in here how do we

70:24

know whether multiplication addition

70:26

subtraction what comes first well really

70:29

if you know the order of operations it

70:31

really is the same here here are the

70:33

different operators listed in their

70:35

order of Precedence exponentiation comes

70:38

first multiplication and division are second

70:41

then addition and subtraction are third

70:43

it's Then followed by concatenation

70:45

which we did in one of the previous

70:46

lessons followed by the comparison

70:48

operators which we're about to get

70:52

to so with that segue here we are

70:54

comparison operators
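Before the comparisons, the math-operator steps just covered (total salary, bonus rate, the confirmation column, and operator precedence) can be sketched in Python. The job rows and salary numbers are invented for illustration, and the cell references in the comments only mirror the formulas described in the lesson. One difference to keep in mind: Python writes exponentiation as `**` where Excel uses `^`.

```python
import math

# Hypothetical rows: (job title, annual salary in column D, bonus max in column E)
jobs = [
    ("Data Analyst", 96000, 12000),
    ("Data Engineer", 120000, 30000),
]

results = []
for title, annual_salary, bonus_max in jobs:
    total_salary = annual_salary + bonus_max   # like =D3+E3
    bonus_rate = bonus_max / annual_salary     # like =E3/D3
    # Confirmation column: multiplication is evaluated before addition,
    # so rate * salary + salary needs no parentheses
    confirmed_total = bonus_rate * annual_salary + annual_salary
    # isclose guards against tiny floating-point differences
    matches = math.isclose(confirmed_total, total_salary)
    # .1% shows the rate as a percentage with one decimal place
    results.append((title, total_salary, f"{bonus_rate:.1%}", matches))

for row in results:
    print(row)

# Precedence: exponent first, then multiplication, then addition
precedence_demo = 2 + 3 * 4 ** 2   # evaluated as 2 + (3 * 16)
print(precedence_demo)
```

If the confirmation column ever disagrees with the total, the usual suspects are a misplaced operator or a reference to the wrong column.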

70:56

for this you probably are familiar with

70:57

the first three the last three are

70:59

something that get a little bit more

71:00

complicated whenever you have a greater

71:02

than or equal to less than or equal to

71:04

or in this case a not equal to so

71:07

previously I just sort of did a cursory

71:09

check to make sure this confirmed

71:11

total salary column equals this other

71:13

total salary column but imagine you have

71:16

hundreds of thousands of rows how can we

71:17

actually compare this and find these

71:19

values well what we can do is we can say

71:22

hey is G3

71:25

equal to I3 this looks a little bit

71:28

confusing right cuz you have two equal

71:30

signs in there but everything to the

71:31

right of the equal sign it's basically a

71:33

comparison and from there it either ends

71:36

up as a true or a false and we can drag

71:39

and autofill this in and everything is

71:41

true similarly if we want to find

71:43

something like is the bonus Max greater

71:45

than the annual salary we can do hey is

71:48

bonus Max at E3 greater than that at D3

71:53

and as is typical of any data science

71:55

job none of these really exceed that at

72:00

all all right now that we're familiar

72:02

with math operators and also comparison

72:04

operators let's dive deeper into cell

72:07

referencing and we've been doing this

72:09

previously whenever we reference another

72:10

cell like A2 but we're going to add a

72:13

little twist to this I'm going to go

72:14

ahead and hide some of these columns

72:16

that way we clear up the Clutter going

72:18

to hide column F by right clicking it

72:20

and selecting hide then I'm also going

72:22

to select all the columns H through k

72:25

and also hide them want everything to

72:28

appear on the same sheet so we're going

72:29

to be referencing this table down here

72:32

for this portion of the exercise and

72:35

this is potentially goals that you may

72:37

have when you're trying to land a job

72:39

you may know how many years of

72:40

experience or you should know how

72:42

many years of experience you have along

72:43

with a goal total salary that you want

72:45

to achieve and so we're going to be

72:47

building out formulas with this in order

72:49

to be able to find out which of these

72:52

jobs actually meet our conditions of the

72:55

expected years of experience and total

72:58

salary for so for this we'll go with

72:59

that I have five years of experience

73:01

then I'm looking at

73:03

$90,000 the first thing we want to calculate

73:05

in column L is whether it meets our

73:08

experience so for this we'll say hey is

73:11

C3 right here less than or equal to the

73:16

value right here in our experience and

73:19

as expected five is less than or equal

73:21

to basically equal to 5 it's true now

73:24

we're going to run into a problem now when we

73:25

try to autofill this if I try to

73:27

autofill this down I'm getting this one

73:30

is false and then these are all true but

73:33

I would expect especially this AI

73:34

specialist at three it would be false

73:37

and so let's actually inspect this well

73:40

as we can see from this this is

73:42

referencing well c23 which is way down

73:45

here but it's still referencing the

73:47

correct C11 right here the problem is we

73:50

didn't really want this value up here

73:54

this C15 to actually change whenever we

73:56

went to do the autofill down below it so

73:59

what we can do here is provide a fixed

74:02

reference of that cell in order to do

74:04

this we're going to insert those dollar

74:06

signs that we saw

74:08

previously before the column and then

74:10

also the row so in this case I have C

74:14

locked and I have 15 locked now the

74:16

formula itself doesn't change at all but

74:18

now when I drag and drop this down all

74:23

of these are updating correctly as

74:24

expected AI specialist is going to be

74:26

false whenever I actually click on it to

74:28

inspect it it's still referencing that

74:30

C15 C11 next we're going to move on to

74:33

column M of seeing if it meets our

74:35

salary requirements so for this one

74:37

we'll be seeing hey is the salary or

74:41

total salary in G3 greater than or equal

74:45

to our total salary down here of 90,000

74:49

now we already know we need to lock c16

74:52

of this 90,000 because we're going to

74:53

be autofilling it down I can manually

74:56

type in the dollar signs but a shortcut

74:58

to this is just pressing F4 if you're on

75:01

a Mac you'll need to press function F4

75:04

anyway this locks this in so now

75:07

whenever I drag and drop this down as

75:10

expected the only other one that's less

75:11

than 90,000 is this data analyst role

75:13

right here now I want to play with this

75:15

just a little bit more so we talked

75:16

about this right here putting a dollar

75:18

sign in front of the column and then a

75:20

dollar sign in front of the row is a fixed

75:23

reference they also have what is called

75:24

a mixed reference so I'm going to go

75:26

ahead and put my cursor right there next

75:29

to G3 I'm going to press F4 and it's

75:32

going to do the absolute reference but

75:34

if I press it one more time it's going

75:36

to do a mixed reference if you notice

75:38

there's only a dollar sign in front of

75:39

the three or if I press it again there's

75:41

only a dollar sign in front of the G now

75:45

technically this is going to work just

75:46

fine because we're going to now lock

75:48

this G column for this but it's going to

75:50

allow the three to update so I'm going

75:52

to show you this now by actually

75:54

dragging and dropping this down and from

75:57

there inspecting that last cell contents

76:00

we can see that that g is locked as

76:01

expected but it moved down now instead

76:04

of locking just the column we could also

76:06

lock the rows so I could also change

76:09

up c16 now instead and lock the rows of

76:13

c16 cuz we're going to still stay in

76:15

that c column right there pressing enter

76:17

now autofill we don't have to just go

76:19

down we can also go up so inspecting it

76:22

locking it didn't really change by only

76:24

locking the row of 16 so let's wrap this

76:27

all up by actually finding out which

76:29

of these actually meet both of our

76:32

conditions of 5 years and 90,000 well

76:34

it turns out that behind the scenes true

76:37

is equal to 1 and false is equal to zero so

76:40

if we actually were to take this and add

76:42

this true to this true right here we

76:44

should get two autofilling it all the

76:47

way down we have 2 1 2 1 so basically

76:50

confirm that hey zero yeah false is zero

76:53

because 0 plus 0 is 0 now I recommend

76:56

instead we're going to be going through

76:57

and doing L3 * M3 so that way anytime

77:00

both of these are true they will

77:03

return a one and now in order to get a

77:06

true or false back on whether it meets

77:08

both we can select that N3 and see hey

77:13

is it equal to one type over there equal

77:16

to one and it evaluates to true so now

77:19

I'm going to go ahead and just hide

77:21

these columns so we can actually see

77:22

this a little bit better but we can

77:25

find values in here that meet our

77:28

conditions of the 90,000 and 5 years and

77:30

let's say we're doing job searching and

77:32

it lasts over a year um we have to

77:34

change this to six this will

77:36

automatically update the formulas that

77:38

we've used here as shown here so that's

77:41

our intro to formulas and for me the

77:43

hardest thing to wrap my head around

77:45

when I was first tackling this was

77:46

around absolute and mixed references so

77:49

we have some practice problems for those

77:51

that purchased the course practice

77:52

problems in order to go through and test

77:54

this out and understanding what happens

77:56

whenever you lock the row or lock the

77:58

column all right and after that we'll

78:00

next be diving into an intro into

78:02

functions which I'll be covering for the

78:04

remainder of this chapter with that see

78:06

you in the next

78:10

one for this lesson we're going to be

78:12

focusing on an intro into functions

78:15

specifically we're going to be going

78:16

over all the different functions that

78:18

we're going to be deep diving within

78:20

this chapter itself along with some

78:22

common problems you may run into and

78:24

errors and how to troubleshoot it to do

78:26

this we'll be continuing on from that

78:29

data set that we used in the last lesson

78:31

specifically we'll be calculating things

78:33

like averages and counts and how many

78:36

jobs actually meet our goals and we'll

78:39

be using functions for this so you can

78:42

continue working in that workbook that

78:43

you had from last time or open this

78:46

function intros workbook in this

78:49

function intros workbook I've gone ahead

78:51

and moved our job goals over here to

78:53

those columns R and S and then added in this

78:56

bottom portion right here for the

78:57

averages and total counts really you can

79:00

do and manipulate as you

79:03

want so why use functions let's look at

79:06

a couple quick examples on the

79:08

importance of these things let's say we

79:10

wanted to get the average of each one of

79:13

these columns of experience annual

79:14

salary and bonus Max previously we know

79:17

we can actually reference each one of

79:19

these cells to calculate the average if we

79:21

wanted to do that we would have to

79:23

actually add up all the values so I have

79:25

to go through select C3 C4 all the way

79:28

down to

79:29

C12 and we would need to divide it by

79:33

that total number of 1 2 3 4 5 6 7 8 9

79:37

10 in that case we'd get the average also

79:40

having me count that 10 wasn't necessarily

79:42

perfect so I don't really recommend

79:43

doing this but anyway nonetheless we can

79:45

actually do autofill to calculate the

79:48

averages for the others as well and it'll

79:50

automatically update the referencing

79:52

correctly to it but I don't recommend

79:54

doing that instead I recommend using

79:57

functions specifically we can use

79:58

something like the average function as

80:00

soon as I start typing a function a in

80:04

this case all the functions that have

80:06

the a name pop up if I wanted to well I

80:08

do know I want average right here I can

80:11

select it it provides a brief statement

80:13

of what it's actually going to do and

80:15

then I can double-click it to insert it

80:18

below here it actually specifies what's

80:21

going on with this function here and

80:24

specifically it prompts me to hey

80:26

provide in these numbers now I could

80:28

select these number by number as we can

80:30

see that there's in brackets here this

80:32

number two that means it's an optional

80:34

parameter but instead what we'll do is

80:36

we'll just provide a range providing it

80:39

from C3 all the way to C12 in that case

80:43

I got 5.3 similar to above and then

80:45

dragging this over we can get all the

80:47

other values as well as a quick example

80:49

also previously we had made this sort of

80:52

convoluted formula in order to calculate

80:54

whether we met both conditions of

80:58

meeting our experience and also our salary

81:00

which were specified over here well

81:02

there's actually a function for that and

81:04

it's called the AND function and what it

81:07

takes for its arguments are logical

81:09

values so it can take a logical one for

81:11

the first parameter I can specify L3 and

81:15

then for the second parameter I can

81:17

specify M3 and notice how this second

81:20

parameter now highlights or becomes more

81:22

bold as I put it in so you can keep

81:24

track of where you are in the formula

81:26

anyway I'm going to close the parenthesis

81:27

press enter and it evaluates to True

81:30

dragging it all down these should match

81:32

these other ones and yeah this is

81:34

definitely something I'd use over these

81:35

formulas that I've used

81:38

before so let's dive into this formula

81:41

tab more and understand the capabilities

81:44

that we're going to be carrying out the

81:45

next lessons in this chapter the most

81:48

powerful of these especially for those

81:49

new to excel is this insert function

81:52

anytime you're looking for a function

81:54

and maybe can't recall the name

81:56

and you're not sure what it even starts

81:57

with you can put something in here so

81:59

say I wanted maybe the average I can

82:02

type in average and then everything that

82:04

basically calculates a different average

82:07

off of it even if they're closely

82:08

related like this rank average will pop

82:10

up in here along with a description

82:12

below explaining it if you've used a

82:15

formula recently you can come in here

82:17

under recently used and I frequently

82:20

find myself just going back to this in

82:21

order to select something I may have

82:23

used recently now in the next seven

82:24

lessons we're going to be diving into

82:27

each one of these all the way through it

82:29

from logical and text to look up and

82:32

also math and trig now one note we

82:34

won't be going into detail on this

82:37

financial functions because I find

82:39

they're sort of nuanced but we will be

82:41

going into all the different ones that

82:43

I'm using on a daily basis as a data

82:46

analyst that aren't specific to

82:49

financial

82:52

applications so let's get into

82:54

understanding the basics about formulas

82:57

by calculating these different counts

82:58

and especially counts around whether any

83:01

of these jobs meet our goals for this I

83:04

know I want to use a count function so

83:06

I'm going to go to this insert function

83:08

I'm going to type in count now there's a

83:10

bunch of different ones that pop up

83:12

count itself just counts the number of

83:13

cells in a range that contain numbers it

83:15

has to have numbers in it if I wanted to

83:17

do something more around text I would

83:19

say hey count the number of cells in

83:21

a range that are not empty I could even

83:24

do something conversely of counting the

83:25

number of blank cells for us we want to

83:27

actually do count so as we showed before

83:30

I'm just going to come in type count

83:32

it's going to prompt me that I need to

83:33

at least put at minimum a value and I

83:36

want to count all these cells here so

83:39

using autofill to fill it over um we can

83:43

see that all the different values are 10

83:44

nothing really spectacular here but now

83:47

let's get into a pretty unique use case

83:50

of count so in this scenario that I'm

83:52

trying to calculate in cell C16

83:54

I'm trying to find out how many jobs

83:57

above here in these 10 right here how

84:00

many meet our goal of less than or equal

84:03

to 5 years and I want to count the

84:06

number of these so I know I want a type

84:09

of count I can go into insert function I

84:11

know it's here inside these different

84:13

statistical functions specifically I

84:16

have these different counts right here

84:19

and I'm going to scroll over this COUNTIF

84:20

right here and it's going to provide

84:21

me a description it says Hey counts the

84:23

number of cells within a range that meet

84:25

the given condition and that's what we

84:28

want to do we want to meet a condition

84:29

of a certain amount of experience now it

84:31

provides this box in order to help me

84:34

input in these values so for the range

84:37

here what I can do is specify hey I want

84:39

to count inside of here if they meet a

84:43

certain criteria and just going back to

84:45

that range right here we can see that it

84:47

already input all those different values

84:50

into an array-like object okay so the

84:53

criteria right now is next I want to put

84:54

something in here I can also press this

84:57

box and it'll make it disappear and I

84:59

want to compare it to this experience

85:02

but I want it to be less than or equal

85:04

to five so I can press enter to accept

85:06

it but the problem is it's going to

85:11

evaluate whether five is any of these

85:15

columns here and right now we see that

85:17

there are two I'm going go ahead and

85:18

close up so we can see this better right

85:20

now we can see that there's two fives in

85:22

here that's not what we want we want

85:25

to see everything that is less than or

85:27

equal to 5 so instead what we need to

85:30

put in here is less than or equal to 5

85:34

now I'm going to press enter and we're

85:36

going to get an error this is pretty

85:38

common whenever you are manipulating

85:41

different formulas and you have in this

85:43

case I have this less than or equal to

85:45

right here so Excel is confused by this

85:48

what we need to do is actually put

85:51

quotation marks around this which basically

85:54

sort of makes it into a string or text

85:56

if you will but now it knows hey I want

85:59

you to look for less than or equal to 5

86:01

I want you to evaluate this entire thing

86:03

pressing enter bam we have six values

86:06

here that are less than or equal to five
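
The reason the quoted criteria matters can be sketched in Python: the criteria arrives as text, and the function has to split it into an operator and a number. This is a rough model of the behavior described here, not Excel's actual implementation; `countif` is a hypothetical helper and the experience values are made up to match the counts mentioned in the lesson.

```python
import operator

# Comparison operators that can lead a criteria string.
_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "<>": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
}


def countif(values, criteria):
    """Count numeric values that match a criteria such as 5 or "<=5"."""
    text = str(criteria)
    # Try the two-character operators before the one-character ones.
    for symbol in ("<=", ">=", "<>", "<", ">", "="):
        if text.startswith(symbol):
            compare = _OPS[symbol]
            target = float(text[len(symbol):])
            break
    else:
        # No operator given: a bare value means "equal to".
        compare = operator.eq
        target = float(text)
    return sum(1 for v in values if compare(v, target))


# Made-up years of experience for ten jobs (assumption, not the course data).
experience = [5, 3, 8, 5, 2, 10, 4, 7, 3, 6]

print(countif(experience, 5))      # only exact fives: 2
print(countif(experience, "<=5"))  # five years or less: 6
```

Passing a bare 5 only finds exact matches, while the text "<=5" carries the comparison along with the number.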

86:10

now similarly I can drag this over

86:12

because we want to also do this for

86:13

salary but I don't want to do less

86:16

than or equal to five I want to do

86:17

greater than or equal to

86:20

90,000 and in this case we have nine cuz

86:23

we only have one that's less than

86:25

this but as you'll find out in this course

86:27

I don't like hardcoding values into my

86:31

formulas in this case I have five inside

86:34

of here but I'm already having five

86:35

right here what happens if I want to

86:36

change this maybe to say something like

86:38

three well it's not going to actually

86:41

update these values right here so I'm

86:44

going to go ahead and actually change

86:45

that back to five and we're going to

86:47

make another formula that actually fixes

86:49

this so I want to drag these down but we

86:52

actually didn't lock either one of

86:54

these cells and it will cause errors if

86:57

we do so I'll just select right next to

86:58

it press F4 next to C3 I'll do the same

87:02

with F4 doing the same in this cell as

87:04

well all right now I'll take this and

87:06

I'll drag this down so now let's

87:09

actually fix this to be more Dynamic we

87:11

don't want it to be less than or equal

87:13

to this five right now what we can do is

87:15

use that ampersand operator and then from

87:19

there put in reference to S3 which

87:24

contains our five pressing enter bam we

87:26

got six same thing here I can delete

87:29

that 90,000 put in an ampersand and

87:32

then from there we're going to be

87:34

basically mashing it

87:36

together with that 90,000 and it

87:38

evaluates now when we change this

87:39

experience to say something like two we

87:42

can see that it actually updates

87:44

appropriately to see that oh only one

87:46

job meets this requirement so pretty

87:49

cool I'm going change that back to

87:52

five now frequently you're going to run

87:54

into errors with your formulas let's say

87:57

I wanted to divide one by zero not a

88:00

good thing that we need to do anyway I'm

88:02

going to get this error right now you

88:03

can notice it because it has this green

88:04

triangle in the upper left hand corner but

88:07

also it starts with this hashtag and

88:09

it's saying hey you have a divide by

88:12

zero error I can even come down into

88:15

here and it tells me even more on this

88:19

provides help on this or if I wanted to

88:20

even ignore it now in this sheet of this

88:23

workbook I have a bunch of

88:25

different errors in here that you may

88:27

run into from time to time again and

88:29

we're going to be running into these

88:31

errors as we go through the rest of this

88:32

chapter so if you get stuck along the

88:35

way while we're going through this I

88:37

feel like this is a good reference for

88:38

you to maybe save somewhere in order to

88:40

understand what is going on with the

88:42

different errors you may encounter now

88:44

the biggest time saver I've found with

88:46

any of these errors is using some sort

88:49

of chatbot specifically me I'm going to

88:50

go to something like ChatGPT or even

88:52

Claude they're going to be able to provide

88:54

really quick help in understanding what

88:56

an error is and what I need to do to fix

88:58

it all right so now it's your turn to

89:00

dive in and test out these intro to

89:03

functions and play with them and

89:04

experience some of the errors of your

89:06

own after that we'll be diving into

89:08

logical functions a major type of

89:10

function that you need to be aware of

89:11

with that I'll see you in the next

89:16

one now that we have the basics down on

89:19

formulas and also functions we're going

89:21

to be moving into one of the most

89:23

important types of functions to know

89:26

logical ones the most popular of these

89:28

are an if condition basically looking at

89:30

something and then providing a response

89:33

based on it so for this analysis we're

89:35

going to be jumping into our data

89:37

science job salary data set but we're

89:40

only going to focus on the first 20 rows

89:43

of it here and on the next few lessons

89:45

as well as I don't want to overwhelm you

89:47

with the all the data just yet now for

89:50

the final results we're going to be

89:51

doing two major things the first is

89:54

determining within this list of jobs

89:57

whether they meet our conditions of

89:59

finding the job we want of a data

90:01

analyst or business analyst and we'll

90:03

mark it not desired or role desired

90:05

additionally we're going to do a common

90:07

practice in analytics of bucketing

90:09

basically taking those salaries and

90:11

depending on the amount value putting it

90:13

into a certain bucket for us we're going

90:15

to be looking at whether they have

90:17

salary data in this data set or more

90:20

specifically if they are greater than

90:22

our goal of 85,000

90:27

so why are these logical functions

90:28

needed well let's jump into that last

90:30

data set real quick and simplify how we

90:33

can actually use these as a quick

90:35

example previously in this P column we

90:37

were evaluating whether they met both of

90:40

our conditions of experience and salary

90:42

we can use an if statement in order to

90:46

clarify this so I can specifically call

90:48

out with an if statement saying if it

90:51

has the logical test that we want to

90:54

actually evaluate so I'm going to put in

90:56

P3 in this case as it's going to return

90:58

true or false and then from there the

91:01

next value in there is value if true

91:03

which what do we want to return if it is

91:05

true well that our goal is met and then

91:08

if it's not met we want to have well not

91:11

met okay and then this whenever we drag

91:14

this down will provide not met or goal

91:18

met depending on if this is true or

91:20

false and so that's the power of these

91:23

if statements in helping us actually

91:25

provide this

91:28

value so that was just a quick example

91:30

of if let's actually jump into some more

91:32

examples so you get more familiar with

91:33

how to use this so here we are in this

91:35

data set and I don't need all the

91:37

columns of this data set so I'm just

91:39

going to select the columns that I don't

91:41

need I'm going select B through G and

91:44

then hide it additionally I'm not going

91:46

to need I or J so I'll hide these as

91:49

well so our first goal is to identify

91:52

whether these jobs meet our conditions of

91:54

either a data analyst or a business

91:56

analyst we're going to start simple by

91:58

just finding out which one is a data

92:00

analyst first and then which one is a

92:01

business analyst and meets those

92:03

conditions so once again we'll start

92:05

with that if condition and for this

92:07

we're going to put in that logical test

92:08

remember from the previous example we need to

92:09

have it return either true or false so

92:13

we're wanting to check whether senior

92:14

data engineer in A2 is equal to data

92:19

analyst in K1 now we're going to be

92:22

autofilling this down so we need to make

92:25

sure that the A2 we're fine with it

92:27

actually adjusting as necessary K1 we

92:29

want it to lock at least lock on the row

92:33

value of one then if it's true we'll be

92:35

role desired and if it's not it's not

92:37

desired as expected senior data engineer

92:40

is not desired let's drag this all the

92:42

way down and just double checking it we

92:44

see that the data analyst roles are role

92:45

desired okay so I can drag this over now

92:49

and just to double check it shifted over

92:51

to B2 but it's selecting

92:53

the right one of L1 so actually what I'm

92:55

going to do is I'm going to delete this

92:57

go back up here I'm sort of a

92:59

perfectionist I'm going to end up

93:01

locking that a value so it stays in that

93:04

A column none of my values are going to

93:06

change here and then when I actually

93:08

drag it over I can check that okay A2 is

93:11

the correct one I want selected to

93:13

compare it to business analyst in

93:15

L1 okay then I'm going to autofill all

93:17

the way down looks like there's only two

93:20

business analyst roles here so now how

93:22

can we identify that it meets both of

93:25

those conditions both data analyst and a

93:28

business analyst well we're going to do

93:30

one approach first and it's called a

93:32

nested if statement and it's not really

93:35

the approach I'm going to recommend but

93:37

it's something that you should be aware

93:38

of so what I'm going to do is I'm going

93:39

to select cell K2 I'm going to go ahead

93:42

and copy this formula plugging it in

93:45

here we have it here and making sure

93:48

that it operates correctly yep it does

93:50

so how does this nested if statement

93:52

work well we're going to still evaluate

93:54

our first condition is the first role

93:57

evaluated as data analyst does it meet

94:00

that if it is we want to mark it as role

94:02

desired now we get into what happens if

94:05

it's not a data analyst well now we want

94:07

to now check if it's a business analyst

94:09

so I'm going to close out this and what

94:11

we can do is I'm going to take this

94:13

business analyst formula right here

94:15

everything up to the if and I'm going to

94:17

go back in here and I'm going to drop it

94:20

in right here inside of the value if

94:24

false so it's a nested if statement an

94:27

if inside of another if so now if we

94:30

don't meet this first condition of the

94:31

value if isn't true it will go into the

94:33

nested if statement and start checking

94:35

this condition is now the senior data

94:38

engineer equal to business analyst if it is

94:40

it's role desired if not it's not desired
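
The nested if logic just described can be sketched in Python, where the second check sits inside the "value if false" branch of the first. This is an illustration only; `nested_if` is a hypothetical name, and unlike Excel's `=` comparison on text this sketch is case-sensitive.

```python
def nested_if(job_title):
    """Mimic an if inside the value-if-false slot of another if."""
    # Outer if: is this a data analyst role?
    if job_title == "Data Analyst":
        return "Role Desired"
    else:
        # Value if false: a second if that checks for business analyst.
        if job_title == "Business Analyst":
            return "Role Desired"
        else:
            return "Not Desired"


for title in ["Senior Data Engineer", "Data Analyst", "Business Analyst"]:
    print(title, "->", nested_if(title))
```

Each extra role would add another level of nesting, which is a large part of why this style gets hard to read.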

94:43

so let's now drag and drop this all the

94:45

way down I'm going to expand this out a

94:47

little bit and now we can see if it's

94:50

data analyst we get role desired along

94:52

now with if it's business analyst also role

94:57

desired but I'm not a fan of nested ifs as

95:00

they're hard to read instead I like

95:02

using the functions of AND and OR these

95:06

should be a little bit familiar because

95:08

we saw it from the intro lessons that we

95:10

did previously with AND it evaluates

95:14

whether both conditions are true so in

95:17

this case I'll put in condition one of

95:20

B3 and then condition two of C3 and

95:24

both conditions are true so it satisfies

95:26

as true dragging this all down in all

95:29

the following cases they're

95:32

not true for both conditions so

95:34

therefore it evaluates as false with OR it

95:37

checks whether condition one or

95:39

condition two is true and then will

95:42

return true so inputting in the

95:45

conditions of B3 and C3 one of the

95:47

conditions here are true actually both

95:49

are dragging it down I expect yeah the

95:52

second and third rows are also true

95:54

where the final one both are false so

95:55

therefore it is false so let's run the

95:57

same AND and OR logic that we've run before

96:00

in order to determine which one we

96:01

actually use so in this one we're

96:03

checking whether both of these jobs of

96:05

data analyst and business analyst are

96:07

equal to this one here senior data

96:08

engineer as expected false and what

96:12

should we expect for all of these

96:13

all of them are false because none of

96:15

these are going to be both data analyst

96:16

and business analyst so as you can

96:18

probably guess OR is probably going to

96:20

be the one that's going to work for us
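
Why AND can never work here and OR can is easy to see in a small Python sketch. The function names and the list of titles are assumptions for illustration; Python's `and` and `or` stand in for Excel's AND and OR functions.

```python
def and_check(job_title):
    """True only if the title equals both targets, which one cell never can."""
    return job_title == "Data Analyst" and job_title == "Business Analyst"


def or_check(job_title):
    """True if the title equals either target."""
    return job_title == "Data Analyst" or job_title == "Business Analyst"


titles = ["Senior Data Engineer", "Data Analyst", "Business Analyst"]

print([and_check(t) for t in titles])  # [False, False, False]
print([or_check(t) for t in titles])   # [False, True, True]
```

A single job title holds one value, so asking for it to equal two different strings at once is false for every row.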

96:21

we're evaluating whether either data

96:23

analyst or business analyst are going to

96:25

match up to that value of senior data

96:27

engineer in this case we're getting

96:29

those true values for data analyst and

96:31

true values for business analyst so now

96:33

we're going to put that OR function

96:36

inside of that if for the logical test

96:39

and from there we can determine whether

96:41

it's role desired or not desired dragging

96:43

all down all of it's matching as

96:45

expected okay I'm going to go ahead and

96:46

hide these

96:50

rows so now what happens if we don't

96:52

want to just evaluate for a true or false

96:56

condition basically we want to evaluate

96:57

for multiple different conditions well

97:00

that's going to be something that comes

97:01

up if you need to ever bucket data which

97:03

we're going to be doing with salaries

97:04

now for this first one we're going to

97:06

just use a simple if statement we want

97:08

to determine whether a salary is greater

97:11

than 85,000 or if it's not we want to

97:14

just specify that the salary is low so

97:16

for this we're going to be evaluating if

97:18

H2 we're going to go ahead and lock that

97:20

H column is greater than that 85,000

97:26

which will lock that completely for the

97:28

85,000 then we want to say the salary is

97:31

greater than 85,000 conversely if it

97:34

doesn't meet this we want to say that

97:35

the salary is low I'm going to expand

97:38

this out a little bit and then we're

97:39

going to drag this down as expected we

97:43

have the values returning for those over

97:45

85,000 and then this one at 35,000 it is

97:48

marked as low now the problem we're

97:50

running into and why we need multiple

97:52

conditions is this says the salary is low

97:55

but there's actually no data there we

97:57

need to specify in these conditions that

97:58

well there's no data so for this we can

98:01

use an IFS function and what happens with

98:05

this is you provide a test and then a

98:07

value if true and that's just the first

98:09

one we can then provide another logical

98:12

test and the value if true so the first

98:14

thing I'm going to test is if there is

98:16

no value there I'm going to go ahead and

98:18

lock that H column as well and when I'm

98:20

looking for a blank I'm just going to

98:22

put in two quotation marks signifying that

98:25

it's blank and the value if true is no

98:28

data okay put another comma we can see

98:30

now we're on to logical test number

98:32

two the next thing we want to test is if

98:34

it's greater than 85,000 so we'll use H2

98:38

again locking that H and we want to see

98:41

if it's greater than or equal to that

98:44

85,000 which will lock if it is we want

98:47

to return back that salary is greater

98:50

than

98:51

85k and finally we're on to the final

98:53

logical test and basically we want all

98:56

of them to pass this condition so

98:59

instead of providing hey salary less

99:01

than 85,000 we're just going to pass in

99:03

true because we want it to be true and

99:05

we would expect this to be any values

99:08

that are numbers between 0 and

99:10

85,000 so like before we're going to

99:12

specify salary low running this we're going

99:15

to expand this out and then drag this

99:19

down we have when there's no data it returns no

99:23

data when the salary is less than 85,000 it returns

99:25

salary low and then whenever it's

99:27

greater than 85,000 the correct result now

99:30

IFS functions are one of the more

99:33

complex functions to work with so you do

99:35

need some practice with this like for

99:37

those that purchased the course practice

99:38

problems you have some now to go into

99:40

and actually try this out manipulate and

99:42

better understand how to work with this

99:44

with that in the next one we're going to

99:45

be jumping into my next favorite type of

99:47

functions math functions which are heavily

99:49

used in data analytics all right with

99:51

that I'll see you in the next one
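
As a recap of the IFS bucketing from this lesson, here is a minimal Python sketch of the same idea: tests run top to bottom and the first one that passes decides the label, with a final catch-all playing the part of the `TRUE` test. The function name, the label text, and the use of `None` for a blank cell are assumptions for illustration.

```python
def salary_bucket(salary, goal=85_000):
    """Return a bucket label; the first matching test wins, as with IFS."""
    # Test 1: blank cell, modeled here as None or an empty string.
    if salary is None or salary == "":
        return "No Data"
    # Test 2: at or above the goal (the lesson's formula uses >=).
    if salary >= goal:
        return "Salary > 85K"
    # Final catch-all, standing in for passing TRUE as the last test.
    return "Salary Low"


for value in [None, 35_000, 85_000, 120_000]:
    print(value, "->", salary_bucket(value))
```

The order matters: if the blank check came last, an empty cell would already have been caught by an earlier test and labeled incorrectly.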

99:57

now in this lesson we're going to be

99:59

using math functions and also some

100:01

statistical functions in order to

100:03

perform Eda or exploratory data analysis

100:07

on our job posting data set and for this

100:10

we're going to be focusing on the five

100:11

major functions of count sum average and

100:16

also Min and Max and we're not only

100:18

going to focus on the core versions such

100:20

as just count but also the if and ifs

100:24

versions so they have multiple different

100:26

versions that we're going to get to now

100:27

for our analysis we're going to be

100:29

diving into the full data set of the

100:31

data science job postings which has over

100:33

30,000 different job postings and in it

100:36

we're going to be specifically diving

100:38

into data jobs that are in the United

100:41

States for data analyst and we're going

100:43

to be able to use these sort of

100:45

different functions that incorporate if

100:47

and ifs in order to fine-tune in what

100:50

we're looking for one quick note you're

100:51

not limited to using United States and data

100:54

analyst you can use the scenario that

100:56

you're in of what country you're in and

100:58

what job title you're most interested in

101:03

instead so we're going to be filling out

101:05

this table right here and we're going to

101:07

start on row three focusing on those

101:09

count functions first now the data set

101:12

is actually much larger than this three

101:14

columns I'll actually unhide between A

101:17

through K but we're not using any of

101:19

these columns in between here so I'm

101:22

just hiding them and making it

101:24

easier for us to work with for this

101:26

we're going to focus on the core

101:27

function of only count and we're going

101:30

to be looking at those that have all the

101:33

yearly salary data in it as you can see

101:35

over here that there's missing blanks in

101:36

here so we don't want to count those

101:38

that are missing anyway what I'm going

101:39

to do here is Select column M and as you

101:43

can see it selects the range of M:M

101:46

and then from there press enter so what

101:48

we're finding is that around 22,000 jobs

101:51

out of these 30,000 we're going to find

101:52

out have salary data and how do I know

101:55

about that 30,000 well let's actually

101:57

see we can actually use instead we can

102:00

use the COUNTA function which stands for

102:04

count all and it counts the number of

102:06

cells in a range that are not empty
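
The difference between COUNT and COUNTA can be sketched in Python with a tiny made-up column. The helper names, the header text `salary_year_avg`, and the use of `None` for an empty cell are assumptions for illustration only.

```python
def count(cells):
    """Like COUNT: only cells holding numbers are counted."""
    return sum(
        1 for c in cells
        if isinstance(c, (int, float)) and not isinstance(c, bool)
    )


def counta(cells):
    """Like COUNTA: every non-empty cell is counted, text included."""
    return sum(1 for c in cells if c is not None)


# A whole column: a text header, two salaries, and two blanks.
column = ["salary_year_avg", 90000.0, None, 120000.0, None]

print(count(column))   # 2, the header and blanks are skipped
print(counta(column))  # 3, the header is counted along with the salaries
```

When a whole column is selected, the text header is one of the non-empty cells, so the non-empty count comes out one higher than the number of data rows.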

102:09

specifically I want to capture those in

102:12

the job title short column right here so

102:14

I'll do A:A and then running this we get

102:17

to see that it's around 32,000 jobs One

102:20

technical note before we continue these

102:22

are since we're selecting the columns

102:24

themselves in this case with COUNTA it's

102:26

also counting that column header in this

102:29

case so if we want to be exactly

102:31

accurate which in this case I just need

102:33

roundabout numbers if we want to be

102:35

exactly accurate technically we would

102:36

want to go in and say subtract one to

102:40

get what the actual value is but frankly

102:43

I'm just trying to look at General

102:44

numbers right now I'm not too worried about

102:46

one or two off so now let's dive into

102:49

analyzing this further for my needs

102:51

looking to specifically focus on the

102:53

United States first so we're going to

102:55

find those that have in the job country

102:58

here United States and for this we're

103:01

going to use the COUNTIF function and

103:05

this counts the number of cells within a

103:07

range that meets the given condition so

103:10

you provide it a range in this case we're

103:12

going to provide the range of that

103:14

column K and then the criteria itself we

103:16

want to filter for United States which I

103:18

conveniently typed above so we'll select

103:20

it right there I'm also going to lock it

103:22

by pressing F4

103:23

and then running this we get that about

103:26

25,000 jobs contain United States so now

103:29

let's evaluate those data analyst jobs

103:31

using that same thing of COUNTIF once again

103:35

we provide the range in this case we're

103:36

looking at that job title short column

103:38

and for this we want to look for data

103:40

analyst locking this cell we get about

103:43

9600 jobs for data analyst now next up

103:47

we're going to be using COUNTIFS

103:49

specifically we're doing this because we

103:50

want to find jobs that contain not only

103:52

data analyst but also that they're

103:55

from the United States now we can't just

103:58

add these two columns together because

104:00

for one, once we add it

104:02

up we see that's even greater than all

104:03

the jobs there that's not what we

104:04

actually want we want conditions like

104:06

here on row 16 where it's a data analyst

104:09

and United States whereas something here

104:12

on row 223 where it's a data analyst in

104:14

s that's not going to meet our condition

104:17

so we wouldn't count it so using

104:20

COUNTIFS, this counts the number of cells

104:23

specified by a given set of conditions

104:26

or criteria for this we need to specify

104:28

a range and then the criteria first

104:31

we'll focus on the range of a for job

104:33

title short and we're looking to match

104:36

that of data analyst which I'll lock by

104:38

pressing F4 then now we're moving on to

104:41

criteria range number two where for this

104:43

one we're looking at Job country now and

104:46

for that we want to look for the

104:48

criteria of United States locking this

104:50

with F4 closing this with parenthesis

104:53

and then running it we get around 8,000

104:56

jobs and this makes sense right because

104:58

it would be less than that 9,000 data

105:01

analyst because some of these aren't

105:03

going to be from the United States
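
Side note for anyone who thinks in code: here's a minimal Python sketch of the counting logic we just did with COUNTIF and COUNTIFS. The rows and the field names (job_title_short, job_country) are made up for illustration, they are not the real data set, and count_ifs is just a hypothetical helper. It only does exact matches, whereas Excel's COUNTIF ignores case and supports wildcards.

```python
# Hypothetical rows standing in for the job postings sheet (not the course data).
jobs = [
    {"job_title_short": "Data Analyst", "job_country": "United States"},
    {"job_title_short": "Data Analyst", "job_country": "Canada"},
    {"job_title_short": "Data Scientist", "job_country": "United States"},
    {"job_title_short": "Data Engineer", "job_country": "United States"},
    {"job_title_short": "Data Analyst", "job_country": "United States"},
]


def count_ifs(rows, **criteria):
    """Count rows where every named column equals its criterion."""
    return sum(
        all(row[column] == wanted for column, wanted in criteria.items())
        for row in rows
    )


total_jobs = len(jobs)                                          # like counting the whole column
us_jobs = count_ifs(jobs, job_country="United States")          # one condition, like COUNTIF
analyst_jobs = count_ifs(jobs, job_title_short="Data Analyst")  # one condition, like COUNTIF
us_analyst_jobs = count_ifs(                                    # two conditions, like COUNTIFS
    jobs, job_title_short="Data Analyst", job_country="United States"
)

print(total_jobs, us_jobs, analyst_jobs, us_analyst_jobs)
```

The combined count can never be larger than either single-condition count, which is the same sanity check we did on the 8,000 versus 9,600 numbers.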

105:05

with how this is Flowing we could

105:06

actually make a visualization out of

105:09

this data right here so going into

105:11

insert and then recommended charts we

105:14

have here a funnel chart so I'm going to go

105:17

ahead and insert that in and this

105:19

basically shows the funnel if you will

105:21

of jobs we have we started with almost

105:23

32,000 jobs and we got towards the end

105:26

of the jobs that we actually care about

105:28

US and data analyst at around 8,000 I'll

105:31

go ahead and move this off to the side

105:32

for

105:34

now all right next moving into the sum

105:37

function and the core one itself of

105:40

actual sum itself it's pretty simple we

105:43

are obviously going to be

105:44

using the salary year average column for

105:46

this because we want to sum up the

105:47

numbers in it, and I'll put in that

105:49

column of M and we get the sum of values

105:51

there now unlike count where a count has

105:53

a COUNTA or count all where we're

105:56

trying to find if there's blanks or not

105:58

that's not really applicable in SUM and

106:00

average and also in Min or Max so I'm

106:04

actually going to go ahead and just gray

106:05

these out because we're not going to

106:06

need them now moving into SUMIF which

106:09

adds the cells specified by a given

106:11

condition or criteria this one is a

106:15

little bit more complex than what we dealt

106:17

with with count because we first want to

106:20

provide the range that we're going to be

106:22

evaluating for a certain criteria which

106:25

in our case the range we want to

106:26

evaluate is job country because we're

106:29

evaluating for if it contains United

106:31

States which I'll lock with F4

106:33

but we're not summing the countries

106:36

because that's a text column so we have

106:37

to provide this sum range which is

106:40

column M similarly once again we can do

106:42

that SUMIF looking for data analyst so

106:45

in this case we're going to be looking

106:46

at column A to evaluate if it has data

106:49

analyst in it and then from there the

106:51

sum range once again is going to be that

106:53

column M now the SUMIFS, similar to that

106:57

COUNTIFS, adds the cells specified by a

107:00

given set of conditions or criteria for

107:03

this one we provide the sum range first

107:07

so it gets a little bit confusing you

107:08

got to make sure that you're actually

107:09

reading the formulas in this case we're

107:10

going to use M because that's the sum

107:12

range we want to use and then we're

107:14

first going to evaluate for that job

107:16

title short that column A which we're

107:18

going to evaluate for data analyst and

107:21

then we'll evaluate for the job country

107:24

evaluating for United States closing the

107:27

parentheses and running this bam as

107:29

expected this value is less than that of

107:33

the data

107:36

analyst now moving into the last three

107:38

of AVERAGE, MIN, and MAX which I think are

107:40

actually more valuable than that sum one

107:42

we did I'm not going to walk through

107:44

actually typing all these in because

107:47

now you have some familiarity with how I

107:48

did the sum which follows the same

107:50

example for AVERAGE, MIN, and MAX. Feel free

107:52

to if you want to you can go through and

107:54

type it out on your own to get more

107:55

experience doing it but overall I think

107:58

this has some very unique insights from

108:01

it from this analysis we did in it we

108:03

can see that salaries in the United

108:05

States are around 125,000 where the data

108:08

analyst is only around 93 and

108:11

specifically us data analyst is around

108:13

94 so data analysts in general are lower

108:16

salaries than the other jobs in the data

108:19

science Industry as far as Min and Max

108:21

go we're having as low as

108:24

25,000 but we're having as high as well

108:27

at least for a data analyst up to

108:29

650,000 and apparently there's a job in

108:32

here around

108:34

$960,000 and you may be wondering what

108:37

jobs correlate to this $15,000 or

108:39

$960,000 well we're going to be diving

108:42

into that further when we get to that

108:43

lookup functions one last note on errors

108:46

before we go I commonly find the most

108:47

common error with these functions is a

108:50

#VALUE! error and that usually occurs

108:52

whenever in this case we had column a

108:54

selected initially for criteria range

108:56

number one let's say we accidentally

108:58

selected multiple different columns for

109:01

this obviously we're not trying to

109:02

evaluate all the different columns we

109:03

only want to evaluate one column for

109:06

that criteria of if data analyst Falls

109:09

in it anyway when I run this I get a

109:12

#VALUE! error. Anyway, this is a common one

109:15

that I see come up time and time again

109:17

so anytime you're going through this any

109:19

of these or the practice problems

109:20

themselves make sure you're

109:22

investigating to see that you've

109:24

actually input in the correct ranges to

109:27

evaluate cuz it's commonly causing those

109:30

value errors all right with that you

109:32

have some practice problems to dive into

109:34

and next we'll be diving into even more

109:36

statistical functions in order to really

109:38

dive into how deep you can go with Eda

109:41

or exploratory data analysis all right

109:44

with that I'll see you in the next

109:48

one we're now going to be taking this up

109:51

a notch shifting gears from focusing on

109:53

math functions now to statistical

109:55

functions for this we're going to be

109:57

using our job posting data set and

109:59

analyzing the salaries in this

110:02

specifically looking at common

110:04

statistical functions like median

110:06

standard deviation and even quartiles

110:09

once we have the basics we're going to

110:10

shift into an actual analysis looking at

110:13

what is the average salary of different

110:16

job titles and we'll even get a sneak

110:19

peek of visualizing it for this lesson

110:21

you can start by opening this

110:22

statistical functions workbook we're

110:25

going to be starting by filling in this

110:27

table here on the different statistical

110:29

functions we're going to be filling out

110:31

and we're still working with that data

110:32

set we did previously if you noticed

110:34

I've hidden a lot of the columns that we

110:35

won't be using for

110:38

this so we've done a few of these

110:40

different type of functions already

110:42

let's go ahead and fill these in for

110:43

count we'll be using the count function

110:45

specifically on that M column of salary

110:47

year average and like before we have

110:49

around 22,000 values for average we'll

110:51

be doing the same on that M column we

110:53

find that's around

110:55

123,000 for MIN we'll also run this on

110:57

the M column and that's around 15,000

111:00

for Max that's going to be around

111:02

960,000 so let's move on to our first

111:06

true statistical function we're actually

111:07

going to go into this to actually see

111:09

what it does and that's median it

111:11

Returns the median or the number in the

111:14

middle of the set of given numbers so

111:18

let's go ahead and type that out median

111:20

and in there we need to specify number

111:22

or numbers we can specify a range we're

111:24

just going to keep it simple right now

111:25

to actually show what this function is

111:28

actually doing it's selecting the middle

111:30

of numbers so I'm just going to select

111:31

these top three numbers right now and

111:34

what I expect for this function to do is

111:36

to provide basically in a set of numbers

111:38

given provide the middle number so it

111:40

should provide us 140,000 which is the

111:43

center number of these three we don't

111:46

care about the center of just three

111:48

values we care about the center of

111:50

basically all of our different values so

111:53

I'm going to place the entire M column

111:54

into it and that is around

111:58

115,000 now why is this average higher

112:01

than this median well let's actually

112:03

visualize it I'm going to select this m

112:04

column and go to the insert tab going to

112:08

histograms I'm going to insert a

112:10

histogram and what this is showing is

112:12

the distribution of salaries from 15,000

112:16

all the way to 950,000. The bottom x-axis is a

112:20

little confusing to read but it's

112:22

basically a range, so in this case 87,000 to

112:25

93,000 how many counts of salaries are

112:28

falling in between that and that's how

112:30

large the bar is next to it anyway

112:33

getting back to that original question

112:35

why is the average higher than the

112:38

median itself if you recall from the

112:40

definition a median is the middle number

112:43

in our set or list but our average

112:47

however is taking all the different

112:49

values and well averaging it out and as

112:52

we can see from it we have a large

112:54

amount of salaries around well

112:58

$100,000 but we do have some up here

113:01

that are getting close to a million

113:02

dollars, these outliers are basically

113:06

causing us to have a higher average
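
To see that outlier effect in numbers, here's a small Python sketch using the standard library's statistics module. The salary values are invented for the demo, only the 960,000 outlier is borrowed from what we saw in the sheet.

```python
from statistics import mean, median

# Made-up salaries clustered around 100,000 (illustrative only).
salaries = [90_000, 95_000, 100_000, 105_000, 110_000]

# Same list plus one extreme value near the max we saw in the data.
with_outlier = salaries + [960_000]

mean_before = mean(salaries)
median_before = median(salaries)
mean_after = mean(with_outlier)      # pulled far upward by the single outlier
median_after = median(with_outlier)  # barely moves

print(mean_before, median_before)
print(round(mean_after), median_after)
```

One extreme salary more than doubles the average here, while the median only shifts to the midpoint of the two middle values.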

113:09

basically those values that are near

113:11

960,000 are dragging that average way

113:14

higher so that's why I prefer to use

113:16

something like the median when I can in

113:18

order to analyze these salaries because

113:21

it's not skewed by these outlier

113:23

salaries that are just something that

113:25

you're probably not going to get all

113:26

right next up is standard deviation and

113:29

for this you have two options,

113:31

STDEV.P and STDEV.S. The P stands

113:35

for population and the S stands for

113:38

sample this data set is around 30,000

113:42

salaries and there's way more than

113:44

30,000 data science jobs available so

113:47

that's a sample of the actual population
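
As a rough illustration of the population versus sample distinction, Python's statistics module has the same pair: pstdev divides by n (like STDEV.P) and stdev divides by n - 1 (like STDEV.S). The five salaries below are invented for the sketch.

```python
from statistics import pstdev, stdev

# Invented sample of salaries, standing in for a slice of a bigger population.
sample = [75_000, 90_000, 115_000, 140_000, 170_000]

population_sd = pstdev(sample)  # treats the list as the whole population (divide by n)
sample_sd = stdev(sample)       # treats the list as a sample (divide by n - 1)

print(round(population_sd), round(sample_sd))
```

The sample version always comes out a bit larger, and the gap shrinks as the number of rows grows, so on tens of thousands of salaries the two are nearly identical.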

113:50

so we're going to be using

113:52

STDEV.S and for this we can insert a range

113:54

into it so what does this value actually

113:57

mean well if we had something like a

113:59

normal distribution which our salary

114:01

data is somewhat close to that we'll

114:03

find that one standard deviation from

114:06

something like the average has, in this

114:08

case right here, 34%, so if we went

114:11

above and below the average by one

114:14

standard deviation around 68% which is a

114:18

heck of a lot of data is within this one

114:21

standard deviation so in our case if I

114:23

was to take the average and then

114:26

subtract this standard deviation along

114:28

with taking the average and then adding

114:31

the standard deviation around 70% of the

114:35

salaries are going to be between 75,000

114:37

and 170,000 but what if we wanted to be

114:40

more precise about finding say something

114:42

like where does 50% of the data actually

114:45

fall well we can use quartiles in this

114:49

case specifically calculating the first

114:51

and third quartile here's a graph that I

114:54

did from my python course which when you

114:56

get done with this course feel free to

114:58

check it out but anyway it looks at the

114:59

salary distribution of data analysts in the

115:01

United States as this histogram right

115:03

here very similar to what we plotted

115:04

previously in Excel but in it I'm able

115:06

to plot out quartile 1 where the

115:08

quartile one starts and then quartile 3

115:11

where that one starts so between this

115:13

quartile 1 and quartile 3 marker lines

115:17

50% of the data Falls here with this red

115:20

dotted line being the median. Anyway,

115:23

let's actually get to calculating this

115:25

so if we want to do something like the

115:26

quartile we're going to see that there's

115:28

a few different functions available for

115:30

this we have exclusive and inclusive

115:33

we're going to do inclusive first and

115:34

then I'll show the exclusive after to

115:36

basically show how it's different so

115:38

this takes two arguments the first is

115:40

the array so I'll put in that range of

115:42

the M column and then lastly it takes

115:46

the quartile and we have one for the

115:48

first quartile two for the median three

115:50

for the third quartile anyway I have

115:52

these values over in the U column so

115:54

I'll just select that and use that for

115:57

this and for the second quartile we're

115:59

seeing that, basically as I just said,

116:01

that's also equal to the median now I'm

116:04

going to go ahead and get rid of these

116:05

MIN and MAX because we can also get those

116:08

with our quartile function and I'm going

116:11

to go ahead and drag and drop this up

116:13

and then also below so what we can see

116:17

from this with this first and third

116:18

quartile is that around 50% of the data

116:20

falls between 90,000 and

116:23

150,000 so frankly when it comes to

116:26

using quartiles like here and standard

116:29

deviation I find myself more gravitating

116:32

towards quartiles anyway what about that

116:34

other quartile function specifically

116:37

that one around exclusive values well

116:40

once again I can select the array that

116:42

we're going to use we're going to use M

116:44

and then finally the quartile itself now

116:47

notice for this one this one doesn't

116:49

have a value of 0 and four that you

116:52

actually can put in for the Min and Max

116:54

it's exclusive so it excludes those

116:57

outliers basically of the Min and Max so

117:00

specifying that column next to it when I

117:03

actually drag this down we can see that

117:05

the MIN and MAX aren't provided in this but

117:08

it's the same values for that first

117:10

second and third quartile if you notice

117:12

here we get this #NUM! error and as we

117:15

inspected when going through this

117:16

formula zero and four were not available

117:20

to actually input into the formula so

117:22

any time you're inputting things into a

117:24

formula that doesn't necessarily exist

117:26

you're going to get this #NUM! error
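
If you want to poke at the inclusive versus exclusive difference outside of Excel, here's a hedged sketch with Python's statistics.quantiles. To my understanding its "inclusive" and "exclusive" methods use the same interpolation idea as QUARTILE.INC and QUARTILE.EXC, but treat that mapping as an assumption rather than a guarantee. The nine values are a toy list, not salaries.

```python
from statistics import quantiles

# Toy data: nine evenly spaced values so the arithmetic is easy to follow.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Each call returns the three cut points: Q1, median, Q3.
inclusive = quantiles(values, n=4, method="inclusive")
exclusive = quantiles(values, n=4, method="exclusive")

# Neither call returns the min or max, so those come from min() and max().
lowest, highest = min(values), max(values)

print(inclusive)
print(exclusive)
print(lowest, highest)
```

On a tiny list like this the two methods give visibly different first and third quartiles. On a column with thousands of salaries they land very close together, which is why the sheet shows what look like the same values.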

117:28

right the last function to investigate

117:30

is the mode and this returns a vertical

117:33

array of the most frequently occurring

117:35

or repetitive values in any array in our

117:38

case we'll once again provide column M

117:41

and surprisingly we find that 90,000 is

117:43

one of the most repetitive values if we

117:45

go back to that histogram we plotted

117:47

earlier we can see that the largest line

117:50

right here with a value of 1,925

117:52

occurrences occurs between 87,000 and

117:55

93,000 so this makes sense on the 90,000

117:58

being the

118:01

mode so let's get into some data

118:03

analysis Now by actually ranking the

118:05

average salaries of these different job

118:08

titles. I'm going to go ahead and hide the

118:10

columns v through R now in order to rank

118:13

the salaries of the different job titles

118:15

I have this list here for you where

118:17

you need to First calculate the average

118:19

salaries of each of these job titles so

118:21

for this we're going to be using, like last

118:23

time, AVERAGEIF. First we need to specify

118:26

the range that we're going to be

118:28

basically running that if on not

118:29

necessarily the values but the range of

118:32

the job titles next we need to provide

118:34

the criteria for this we'll provide that of

118:36

data analyst which is in W2 and then

118:39

finally the actual average range of

118:42

column M dragging this all the way down

118:46

we have our different averages for all

118:47

the job titles one note real quick in

118:49

future lessons we're going to be jumping

118:51

into using

118:52

median to evaluate these job titles because

118:54

personally I like that more but that's a

118:56

slightly more complex problem so we're

118:58

going to stick simple for now anyway

119:00

with these averages we can now

119:02

actually rank it and this Returns the

119:05

rank of a number in a list of numbers, its

119:07

size relative to other values in the

119:10

list so first we need to put in the

119:13

number that we want to rank in this case

119:15

we want to do that of data analyst

119:16

and then from there they have the ref or

119:18

the reference array in this case we're

119:20

going to provide it right here from X2

119:23

all the way down to X11 now I can change

119:26

this from descending to ascending but

119:28

I'm going to keep it how it is now I'm

119:30

going to drag and drop all the rest of

119:31

these and we had a little bit of an

119:33

issue CU we have repeating numbers right

119:35

here it's obviously because I didn't

119:37

lock my cells appropriately so selecting

119:40

this range that I want to actually lock

119:42

and pressing F4 go ahead and lock that

119:45

and then we'll drag and drop this again

119:47

hopefully this works this time and boom

119:49

we have all these ranked from highest to

119:51

lowest we can see business analysts are

119:53

some of the lowest, data analysts not far

119:55

behind and Senior data scientist has the

119:57

highest I'm going to take this one step

120:00

further I'm going to highlight

120:01

everything from job title down to the

120:03

bottom salary for software engineers, I'm

120:05

going to go into insert in here and go

120:07

to recommended charts and basically the

120:09

first one that pops up this clustered

120:10

bar chart I'm going to insert in and I

120:13

can just change the salary up here by

120:15

double clicking in here and I put

120:16

average salary of data science jobs and

120:20

there we have some data analysis from

120:22

actually viewing these one minor touch

120:24

to this I really don't like how these

120:25

are unordered right now so I could

120:27

actually go up here select these three

120:29

titles right here and then under the

120:31

Home tab select I want to actually

120:33

filter it and then order this rank from

120:37

well we'll say largest to smallest one

120:39

note you may not have been able to see

120:40

it but it actually rearranged the data

120:42

inside of our data set that's not a big

120:44

deal for me I'm not caring too much but

120:47

that is something that will be affected

120:48

whenever you do this anyway with this we

120:50

can see things like senior roles are

120:52

getting paid the most and things like

120:54

analyst are sometimes getting paid the

120:56

least compared to these all right you

120:58

now have some practice problems to go

121:00

into and thus practice your skills with

121:02

these statistical functions after that

121:04

we're going to be jumping in the next

121:05

lesson into arrays which is a super

121:08

powerful feature sort of new to excel in

121:10

the past few years all right with that

121:12

I'll see you in the next

121:16

one we're going to be now shifting gears

121:19

and jumping into a more advanced topic

121:22

of arrays and with arrays what you can

121:24

do is typing a formula in a single cell

121:27

we can use this to fill in cells below

121:30

it or cells to the side of it all with

121:32

one single formula so we're going to be

121:34

slowly working up to an easy then a

121:37

medium and then a hard problem and how

121:40

to use these first up with the easy one

121:42

we're going to go through and basically

121:44

identify all the unique job titles and

121:47

then go through and actually sort it

121:49

alphabetically using arrays next we're

121:51

going to move into our medium problem

121:53

of calculating the median salary if you

121:56

recall back to our last lesson we were

121:58

calculating the average salary based on

122:00

a job title well we can use arrays to

122:02

calculate the median and then finally

122:05

one of the hardest problems we're

122:07

going to get into actually looking at

122:09

based on the month how many different

122:12

jobs were submitted during that month

122:15

and for this we'll be using the

122:17

SUMPRODUCT formula and a combination of

122:19

other ones using arrays. For this we'll be

122:22

using the arrays formula Excel

122:26

workbook now before we jump into those

122:29

problems we need to First understand

122:31

that there's actually two different

122:33

types of arrays we're going to start

122:35

with the first one of modern dynamic

122:36

arrays which we've seen before and with

122:39

this what we can do is using a formula

122:41

we can specify a range to identify and

122:44

then whenever we press enter B2 to B5

122:46

it's going to actually fill in with all

122:48

these we can see that it's modern or

122:51

dynamic because it has this Shadow

122:54

around the edge if I select any of the

122:56

other ones and not the core one where

122:58

this one's actually highlighted when the

123:00

other ones these are grayed out taking

123:03

this a step further with array

123:04

multiplication we can actually go in and

123:07

multiply this column one of A2 to A5 and

123:11

multiply it times B2 to B5 anyway in

123:15

this sequence you can see that it goes

123:16

down 1 * 1 is 1 whereas 4 * 4 is well 16

123:20

anyway that's modern dynamic arrays
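
For a code-flavored picture of that array multiplication, here's a tiny Python sketch. I'm assuming both columns hold 1 through 4, which matches the 1 * 1 and 4 * 4 results called out above, but the middle values are my guess at what's in the sheet.

```python
# Assumed contents of A2:A5 and B2:B5 (only the first and last rows are confirmed).
col_a = [1, 2, 3, 4]
col_b = [1, 2, 3, 4]

# One expression produces the whole result column, row by row,
# similar to how a single dynamic array formula spills down the sheet.
product = [a * b for a, b in zip(col_a, col_b)]

print(product)
```

The point is that one expression yields every row of the result, rather than one formula per cell.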

123:22

classical arrays let's say we want to do

123:24

the same thing in this case well we're

123:26

going to have to go about it a little

123:28

bit differently we need to select all

123:30

the cells that we want to fill in first

123:32

is a very key concept to get right

123:34

first then from there we can start

123:36

entering our formula so I put in equal

123:38

in this case we want to do the same

123:39

array multiplication I'll take A2 to A5

123:43

times B2 to B5 now whenever I am

123:48

done with this and I want to actually

123:50

execute this I don't just press enter I

123:52

have to press Ctrl+Shift+Enter and

123:54

then it fills in the array notice it's

123:56

not grayed out around the edges like

123:58

this as a shadow this one does not do

124:01

that and all of the different formulas

124:03

are now filled in below this and you'll

124:05

notice that there's a curly bracket

124:07

around this this was used prior to

124:10

around

124:11

2020 and so you may come into contact

124:15

with Excel spreadsheets that have this

124:17

and if you don't know about it if you

124:18

come into here and say you want to like

124:20

mess with this formula and you press

124:22

enter you're going to get an error

124:23

message but now let's say we have some

124:26

additional values in it we'll say We'll

124:27

add five to each to the bottom of these

124:29

if I wanted to adjust this array if I

124:31

came in here and then change this to six

124:34

for both the bottom and the top and

124:37

press control shift enter it's only

124:38

going to adjust the ones that were

124:40

previously selected so now if I want to

124:42

include this bottom row right here for

124:43

modern dynamic arrays it's pretty easy I

124:45

can just come in here and adjust this to

124:47

six and this is done however for

124:50

classical arrays or classic arrays not

124:53

classical I have to actually select all

124:56

these different cells and then go in and

124:58

actually enter the formula that I want

125:00

to enter if I try and press enter it's

125:02

going to give me an error message and I

125:04

realize okay I have to press control

125:06

shift enter and it'll actually fill in

125:08

anyway the main point of this is

125:09

classical arrays are a mess we're going

125:10

to be focusing on modern or dynamic

125:12

arrays for the remainder of this course

125:14

but you need to be aware of classical

125:16

arrays in case you encounter them in the

125:18

wild

125:21

so jumping into our data analysis we're

125:23

going to be focusing with the data set

125:25

that we've been focusing on before and

125:26

I've hidden any columns that I don't

125:28

feel are relevant for our future

125:30

analysis that we're going to do anyway

125:32

the first thing we're going to do is find

125:33

the unique job titles and for this we

125:36

can use the unique function and this

125:39

Returns the unique values from a range

125:41

or array so the first thing we need to

125:43

do is actually put in the array itself I

125:46

don't want to actually select this

125:47

column A because I don't want this job

125:48

title short to appear so I'm going to

125:50

select A2 and then press control shift

125:53

down to select all the way to the bottom

125:56

I'm going to close this parenthesis and we

125:58

have all the different unique titles in

126:00

there now I want to get the sorted job

126:03

titles out of this so as you guess we're

126:05

going to use the sort function and for

126:08

this all we really need to do is specify

126:09

the array now if you notice from this

126:12

one whenever I went ahead and selected

126:13

it it specifies that R2 pound and that

126:17

basically says that hey there's an array

126:20

basically formula inside of R2 we

126:23

want to extract all the contents of that

126:25

using R2 pound and so that's going to

126:29

work to be able to provide us all those

126:30

values and then it's going to sort it in

126:32

this case we have it sorted in

126:33

alphabetical order one thing I haven't

126:35

called out both times is these are once

126:36

again dynamic or modern arrays you can

126:39

see that gray box around each of these

126:42

but just to show this also works by

126:44

specifying R2 to R11 it's going to

126:48

provide us the exact same results but I

126:50

really like the shorthand nomenclature

126:52

of the R2 hashtag

126:56

sign now we're going to get into

126:58

calculating the median salary and if you

127:02

recall back to our last lesson on

127:04

statistical functions we went through

127:06

and calculated the average salary for

127:09

each of these job titles using an

127:12

AVERAGEIF function but as discussed

127:15

last time when comparing something like

127:16

the average to the median the average in

127:19

this data set is slightly higher due to

127:22

those basically outliers of those High

127:25

salaries around 960,000 so we want to

127:28

use median so what are we going to be

127:32

eventually calculating? Well, that's

127:34

this table right here where we sorted

127:37

our business or job titles themselves

127:40

and then we go into actually calculating

127:42

the median salary based on these

127:45

different job titles from our data set

127:47

now there's a pretty complex formula

127:50

going into here so because of this we're

127:53

actually going to break it down step by

127:55

step by step going through each column

127:58

explaining how this actual process works

128:00

in order for us to get to this final

128:02

value for this we're going to be doing

128:04

it for data analyst only as we can see

128:06

the final value we're going to get to is

128:07

90,000, which over here in our final

128:10

results is 90,000 so I'm going to go ahead

128:12

and delete this to actually start with

128:14

now we need to look for two separate

128:16

conditions the first one we need to look

128:19

to find do the job titles here actually

128:23

match up to this value here of data

128:27

analyst and this provides Boolean values

128:30

back where we get to this value down

128:32

here for true as expected in row 16 we

128:34

have data analyst now if we scroll down

128:37

further we can see that our next data

128:39

analyst job doesn't have a salary for it

128:43

these type of things will throw off our

128:45

final median function that we're going

128:47

to actually be calculating and so we

128:49

need to basically filter it out as well

128:51

well so with the salary data set

128:54

selected I'm going to then go through

128:55

and filter this basically not equal to a

128:59

blank value and as expected we're

129:01

getting false values for these blank

129:03

ones now similar to what we saw in the

129:05

intro in arrays where we were

129:07

multiplying different arrays together

129:10

we're going to do the same thing here

129:11

with these Boolean values for this I'm

129:14

taking that formula and wrapping it in

129:15

parenthesis it needs to be in

129:16

parenthesis in order to execute properly

129:19

for the first condition, that it contains data analyst, and

129:22

then the second condition that the

129:23

salary can't be blank whenever we

129:26

multiply these two Boolean values

129:28

together we get returned back either a

129:31

zero or one and the only way we get a

129:33

one back is if both these values are

129:37

true which is the condition we want to

129:38

meet now for zero or one values we can

129:42

actually see if we did an if statement

129:44

here if we did a logical test of zero

129:47

what is it going to return whether true

129:49

or false so for zero it returns false and for

129:53

one we'd expect it to return true anyway we

129:55

don't want to necessarily return true in

129:57

this case we want to return the salary

130:00

that corresponds to that Row in the data

130:03

set so I'm going to go ahead and delete

130:05

this so for this we're going to start

130:06

with that if function itself then I want

130:09

to place all the different contents that

130:11

we saw in that previous V column now we

130:13

want to return the salary which are

130:15

these contents right here so that'll be our

130:18

value if true and then if false we just

130:20

want it to be false which we can

130:22

just leave blank so now scrolling down

130:24

we can see that we have nothing but

130:26

those values for data analyst scroll

130:28

over just to confirm row 129 yep data

130:30

analyst all right the last step we need

130:33

to go ahead and put inside of our median

130:35

formula all those contents that we had

130:38

before that entire if statement itself

130:41

to evaluate so that array that it's

130:42

going to basically find out for all

130:45

those salaries for data analyst and it'll

130:47

return back the median salary now this
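As a rough model of the median-if pattern just built, here is a Python sketch. The sample rows are invented, and the Excel formula it mirrors is approximately `=MEDIAN(IF((titles="Data Analyst")*(salaries<>""), salaries))` with hypothetical range names.

```python
from statistics import median

# Made-up sample rows: (job title, salary); None stands in for a blank cell.
rows = [
    ("Data Analyst", 90000),
    ("Data Analyst", None),
    ("Data Engineer", 120000),
    ("Data Analyst", 85000),
    ("Data Analyst", 100000),
]

def median_if(rows, title):
    # Keep the salary where both conditions hold. In Excel the other rows
    # become FALSE, which MEDIAN ignores inside an array; here we drop them.
    kept = [salary for t, salary in rows if (t == title) * (salary is not None)]
    return median(kept)

print(median_if(rows, "Data Analyst"))  # 90000
```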

130:49

also works for other functions so let's

130:52

say we wanted to use the mode we want to

130:55

use a mode if condition they don't have

130:57

this available so we could just plug

130:59

this inside of mode and then running

131:02

this we can see that well the most

131:03

common value for data analyst

131:05

is apparently also the median of 90,000 so

131:07

going back to our data sheet let's

131:09

actually go through and step by step

131:11

calculate it for each of these different

131:14

sorted unique job titles that we did

131:15

previously and we're going to be

131:17

building this step by step how I'd

131:18

normally build a formula so the first thing

131:20

we're going to look at is the job titles

131:21

itself do they match to that business

131:23

analyst so selecting cell A2 and then

131:26

control shift down to select all the

131:28

contents of the column we want to see if

131:31

that's equal to this business analyst

131:34

role right here and now remember we're

131:35

going to be dragging these down to do an

131:37

autofill so we need to be particular

131:39

about how we lock these cells

131:41

specifically we do need to lock these

131:43

values right here and just for safe

131:45

measure I'm going to lock the column of

131:48

this one okay pressing enter all right

131:51

we have our array back looking for

131:53

business analyst and we can see that

131:55

it's working by what we see down here in

131:57

row 84 so let's actually do that array

132:00

multiplication by now filtering out

132:03

salary that doesn't have values or

132:06

blanks so we're going to put another set

132:08

of parentheses next to it we'll put in

132:10

our salary data and that it's not equal

132:13

to blank running this we confirm that

132:16

the first value of business analyst that

132:19

has a salary has a one now we need to

132:21

wrap this all inside of an if to

132:24

basically return instead of that one we

132:25

want it to return the salary itself so

132:29

for the value of true I'm going to put

132:30

in the selection of the salary yearly

132:33

running this we confirm this is again

132:36

correct looking at row 180 almost done

132:39

just now need to wrap this all inside of

132:42

a median function and Bam

132:46

85,000 and hopefully we actually locked

132:50

all the cells properly dragging it down

132:53

Bam looks like we got all our things and

132:55

we slightly messed up our formatting

132:57

here so I'm going to go ahead and put a

132:59

thick outside border on again to make

133:01

that right again all right so that's how

133:03

you basically transform any function in

133:07

Excel that doesn't have that you know

133:09

count if or average IF function or

133:12

capability into other

133:16

functions now moving into probably the
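The general idea from the median example (filter first, then hand the result to any aggregate) might be sketched like this in Python. The helper name `aggregate_if` and the sample values are hypothetical, chosen only for illustration.

```python
from statistics import median, mode

def aggregate_if(func, titles, salaries, wanted):
    """Apply func to the salaries whose title matches and whose salary is present."""
    kept = [s for t, s in zip(titles, salaries) if t == wanted and s is not None]
    return func(kept)

# Made-up sample columns; None stands in for a blank salary cell.
titles = ["Business Analyst", "Data Analyst", "Business Analyst",
          "Data Analyst", "Business Analyst"]
salaries = [85000, 90000, None, 90000, 80000]

# One result per sorted unique job title, like filling the formula down a table.
for job in sorted(set(titles)):
    print(job, aggregate_if(median, titles, salaries, job))
```

Swapping `median` for `mode`, `max`, or any other aggregate reuses the same filter, which is the point of wrapping the IF array inside the outer function.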

133:19

most complex example that we're going to

133:20

be using not only in this lesson

133:22

but probably in the entire course so if you

133:24

get around this you're going to be good

133:26

to go for the rest of the course anyway

133:28

what we're trying to look at here is the

133:30

count of job postings based on the month

133:34

that it was posted in and we're going to

133:37

be using the sum product function for

133:40

this now sum product is not anything

133:43

that you should be afraid of basically

133:44

before we were doing whenever we were

133:47

doing the intro and we were talking

133:48

about array multiplication how we went

133:51

through line by line based on this and

133:54

we have our values of 1 4 9 16 and 25

133:57

line by line well if we were to do the

134:01

sum

134:02

product of the values in column A along

134:06

with the values in column B we're going

134:08

to get 55 which when we look at the sum

134:13

of these values here we can see that it

134:16

is 55 so it's a sum of the product of

134:22

the arrays so getting back to our
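A small Python sketch of what sum product does, assuming (as the values 1, 4, 9, 16 and 25 suggest) that columns A and B both hold 1 through 5. The function name `sumproduct` is just an illustrative stand-in for the Excel function.

```python
from math import prod

def sumproduct(*arrays):
    # Multiply the arrays element by element, then add up the products.
    return sum(prod(values) for values in zip(*arrays))

col_a = [1, 2, 3, 4, 5]
col_b = [1, 2, 3, 4, 5]

print([a * b for a, b in zip(col_a, col_b)])  # [1, 4, 9, 16, 25]
print(sumproduct(col_a, col_b))               # 55
```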

134:25

example that we're going to be solving

134:26

we're trying to aggregate it by these

134:28

names of these months if we actually

134:30

scroll over to the data set itself the

134:33

job posted date is in a date time format

134:37

so similar to the last example I'm going

134:39

to be walking you through column by

134:41

column by column on how we get to this

134:44

final value that we're going to be

134:46

eventually putting into our table here

134:49

to thus calculate these values for the

134:51

counts per month so we go ahead and

134:53

clear these cells to start and we're

134:54

going to start first by we want to

134:57

extract out the month from this job

135:00

posted date column so for this we can

135:03

use the text function which we're sort

135:06

of jumping ahead because we'll be doing

135:08

text functions in upcoming lessons but

135:10

it's a good little sneak peek anyway we

135:12

can plug in here something like a date

135:15

time value and then from there we want it

135:18

to Output what is the format text for

135:21

well I know that if we do three M's it's

135:25

going to provide me the shorthand month

135:28

of this additionally if I do four M's it's

135:31

going to provide me the longhand month of

135:33

this and there's a host of different

135:35

format codes that you can provide shown by

135:37

this table here when I'm looking it up

135:39

in something like perplexity that says

135:41

that hey if you provide certain things

135:42

like if I provided a Double Y it's going

135:45

to provide the two-digit year and so on

135:48

for other values you can look this up in

135:50

something like ChatGPT anyway getting back
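For comparison, the three format codes mentioned here have close analogues in Python's `strftime`. This is only an analogy under the default English locale, not how Excel implements TEXT, and the date is a made-up example.

```python
from datetime import datetime

posted = datetime(2023, 1, 15, 8, 30)  # made-up job posted date

# Excel TEXT format code -> rough strftime equivalent
short_month = posted.strftime("%b")  # like "mmm"
long_month = posted.strftime("%B")   # like "mmmm"
two_digit_year = posted.strftime("%y")  # like "yy"

print(short_month, long_month, two_digit_year)  # Jan January 23
```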

135:52

to this example itself I want to

135:53

actually autofill this all the way

135:55

through it's not around any other

135:57

columns that I can actually autofill all

135:59

the way down and I don't want to sit

136:00

here and drag it all the way so what I

136:02

can do is select the column itself and

136:04

then when it has these basically four

136:06

arrows I can then drag it where I want

136:08

I'm going to drag it right next to here

136:10

and then now actually autofill it all

136:12

the way down now that I have it complete

136:14

I'm just select this column again make

136:16

sure I have those four arrows again and drag

136:18

it back to the column it needs to be now

136:20

seeing what I did here you're probably

136:21

like Luke can't you use something like a

136:23

count if in order to calculate the

136:25

months now using this and you'd be

136:28

correct with that remember call back for

136:30

the count ifs we can provide a criteria

136:33

range in this case we're going to

136:34

provide it column V and then for the

136:37

criteria itself we'll provide the actual

136:40

month and then actually dragging and

136:42

dropping this all the way down once

136:44

again my formatting got messed up so I'll

136:46

put that thick outside border back on

136:48

there anyway these values here for what

136:50

we're going to get finally are the same

136:54

and so you really could stop this lesson

136:56

right here and if you want to do this of

136:58

creating a new column and then just

136:59

using count ifs you can do that but this
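The helper-column approach (extract the month name into its own column, then count matches) could be modelled in Python roughly as follows. The dates are invented sample data.

```python
from collections import Counter
from datetime import datetime

# Made-up job posted dates.
posted = [
    datetime(2023, 1, 5),
    datetime(2023, 1, 20),
    datetime(2023, 2, 3),
    datetime(2023, 12, 31),
]

# The extra "month" column, one long month name per row.
month_column = [d.strftime("%B") for d in posted]

# Counting each month name plays the role of count ifs over that column.
counts = Counter(month_column)
print(counts["January"])  # 2
print(counts["March"])    # 0, since no rows match
```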

137:02

is a lesson on arrays so we're going to

137:04

get more complex with this on how

137:06

to use the arrays in order to actually

137:08

calculate this without having to create

137:10

these extra columns so I'm going to go

137:11

ahead and hide this cuz we're not going

137:13

to use it so before we can actually

137:15

sum it up we need to get an array of

137:17

all the values that we'll say equal to

137:19

January so we'll start by creating

137:21

that text function it's going to be

137:22

slightly different from before cuz we're

137:24

going to be making it out of an array we

137:25

want to actually select all the values

137:28

from H2 all the way down to the bottom

137:31

we want to then go ahead and lock it we

137:33

want it to be evaluating for that long

137:36

month name so four lowercase M's and

137:40

then I want to check if it's equal to in

137:42

our case we're looking for January we'll

137:44

look up here at this U2 or U1 I got a

137:47

typo up there update that to U1 anyway

137:49

we now have okay that this value is true

137:52

right here and we can tell from row 11

137:54

that this is in January it is true so

137:57

it's working out just fine so now if I

138:00

tried to actually run a sum product which

138:02

is what we're finally trying to do on

138:05

all the contents of this array itself

138:08

we'll do W2# we're going to

138:11

get back zero because this isn't in the

138:14

format that we want we actually need to

138:15

convert this unfortunately although it

138:18

is on the back end a zero or a one the

138:20

actual functions themselves can't

138:22

actually calculate it so we can do this

138:25

by basically converting it and the first

138:27

thing we can do actually is just we'll

138:29

put one negative sign and then I'll put

138:32

in that W2# and what this does is

138:35

it negates the Boolean values so

138:37

basically true which is normally a one

138:41

it negates it and makes it negative one

138:43

and zero well negative zero is still zero anyway we

138:46

need to actually apply two negative

138:48

signs cuz we don't want it to be negative

138:50

one we want it to be positive one so

138:53

doing this one more time we now have

138:55

positive ones in there so now we are
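The double negation trick can be illustrated in Python, where booleans happen to negate the same way. This is only an analogy: unlike Excel's sum product, Python's `sum` would also accept the raw booleans directly.

```python
flags = [True, False, True]  # made-up result of a comparison such as month = "January"

once = [-f for f in flags]      # one minus sign: True becomes -1, False becomes 0
twice = [-(-f) for f in flags]  # second minus sign flips -1 back to 1

print(once)        # [-1, 0, -1]
print(twice)       # [1, 0, 1]
print(sum(twice))  # 2
```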

138:57

using sum product because sum product

139:00

I feel is better with arrays but we

139:02

could use in this case where it's a

139:04

single array we could use actual just

139:08

sum itself I did want to show that

139:10

and we get that value of 3102 which

139:13

correlates to what I expect as the value

139:15

but we're going to use sum product

139:17

because as you'll find out in future

139:19

lessons we're actually going to be

139:20

modifying it even further what's inside

139:23

of here and so we need this sum product

139:25

in order to do those anyway we get the

139:27

same value of

139:29

3,102 so going back to our data tab let's

139:32

actually calculate this fully for all of

139:35

these different values walking through

139:36

it step by step by step as we did

139:38

previously we're going to start with our

139:40

text function and we want to look at

139:43

that job posted date column I'm going to go

139:46

ahead and lock all those cells it's very

139:47

important for this since we're going to be dragging and

139:48

dropping that down and remember for the

139:50

format text to this we want it to be

139:52

four lowercase M's and in this we're

139:56

checking whether it's equal to this

139:59

value here of V2 which is January and

140:01

I'm going to go ahead and actually lock

140:03

just that column pressing enter to make

140:05

sure it goes correctly yep we got a true

140:07

value here for our row 15 value first

140:10

thing we want to do is do that double

140:13

negation which we need to actually wrap

140:15

these in this whole formula itself in

140:18

parentheses in order to get our zero and

140:21

one values and then finally we're going

140:23

to wrap this all once again in sum

140:26

product putting that closing parentheses

140:29

on there pressing enter get 3102 and

140:32

then doing autofill all the way down we

140:34

have all our values once again format is
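Putting the steps together without a helper column, a Python sketch of the per-month count might look like this. The Excel formula it mirrors is roughly `=SUMPRODUCT(--(TEXT(dates,"mmmm")=month))`, where `dates` and `month` are hypothetical names for the locked date range and the month cell; the dates below are invented.

```python
from datetime import datetime

def count_in_month(dates, month_name):
    # Step 1: compare each date's long month name to the target (array of booleans).
    matches = [d.strftime("%B") == month_name for d in dates]
    # Step 2: convert booleans to 0/1, the job of the double negation.
    ones = [int(m) for m in matches]
    # Step 3: add them up, the job of sum product on a single array.
    return sum(ones)

# Made-up job posted dates.
dates = [
    datetime(2023, 1, 5, 9, 0),
    datetime(2023, 1, 20, 14, 30),
    datetime(2023, 2, 3, 11, 15),
]

# One result per month row, like autofilling the formula down the table.
for month in ["January", "February", "March"]:
    print(month, count_in_month(dates, month))
```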

140:37

messed up I'm going to put that thick

140:38

outside border now the other reason why

140:40

we're using sum product in this case is

140:43

because in older versions of Excel

140:45

before we had these uh modern dynamic

140:48

arrays sum is not going to be able

140:51

to work over arrays and you actually

140:52

have to use sum product so this allows

140:55

us also to have a safe way to

140:59

calculate using arrays and then give it

141:01

to people that may be archaic and have

141:04

older versions of Excel all right it's

141:06

your turn now to jump into some practice

141:08

problems to get more familiar with

141:09

working with arrays inside of formulas

141:13

in the next lesson we're going to be

141:14

getting into probably I think one of the

141:16

most funnest types of functions lookup

141:19

functions like vlookup and X look up and

141:21

things like that which are super helpful

141:23

for data analysis all right with that

141:26

I'll see you in the next

141:30

one lookup functions are one of the most

141:34

I'd say funnest functions whenever

141:37

you're learning to be a freak in the

141:38

sheets specifically we're going to be

141:40

focusing on three different lookup

141:42

functions vlookup H lookup and X lookup

141:46

V in V lookup stands for vertical H in

141:49

H lookup stands for horizontal and X in X

141:51

look up just uh they wanted to be

141:53

different in order to learn about these

141:54

functions we're going to be performing

141:55

some data analysis and if you recall

141:57

back from our math and statistical

141:59

functions lessons we found out what the

142:02

median Min and Max salaries were but for

142:06

the things like the Min and Max what

142:08

were those different job postings that

142:11

correlated to that well based on the

142:13

structure of our data set we can use the

142:14

vlookup and also X lookup functions in

142:18

order to find this out now because of

142:20

the structure of our data we're going to

142:22

have to do something different in order

142:23

to implement H lookups and for this

142:26

we're going to be able to get out or

142:28

extract out horizontal type data we're

142:31

going to basically transpose it into a

142:33

vertical format using H lookup but if

142:35

there's anything you remember from this

142:36

lesson it's that of X lookup this one is

142:40

the most dynamic and flexible and how it

142:42

can be used and we're going to be doing

142:43

in a final example using this in order

142:46

to bucket our salary data set allowing

142:50

us to categorize it into different

142:52

ranges and whether it has data or not

142:56

all using X lookup for this we can start

142:59

using the lookup functions workbook we

143:02

have two main tabs in this data and

143:04

data_2 the data tab is where we're going to

143:07

start in first for this section on

143:09

vlookups so for this we're going to be

143:11

using that job posting data set I've

143:12

hidden any unnecessary columns and we're

143:14

going to be filling in this table right

143:16

here so what I'm trying to do with this

143:18

is fill in based on this Min as you can

143:21

see the formula for Min the formula for

143:23

max and the formula for median where we

143:25

actually calculate this from the salary

143:27

year average column we want to then

143:29

extract out based on these values the

143:31

company name a job title associated with

143:34

it and then the country associated with

143:38

it so we're going to start with V lookup

143:41

first and V lookup looks in a vertical

143:44

type format specifically it says it

143:46

looks for a value in the leftmost column

143:48

of a table and then returns a value in

143:51

the same Row from a column you specify

143:54

so for the first value of this we want

143:56

to provide the lookup value in this case

143:58

we want to look up 15,000 from that

144:01

salary year average column then from

144:03

there we need to provide the table array

144:05

now remember for this it needs to be the

144:07

leftmost column of the table and we want

144:11

to get columns M and O I'm going to

144:12

select column o because if we start at M

144:15

and try to go down it's going to mess up

144:17

cuz there's blank in it so I'm just

144:18

going to do control shift over and then

144:20

control shift down to select all the

144:22

data and then change this A column to M

144:25

instead the next thing we need to

144:27

specify is the column index number and

144:30

right now we're in column M so that

144:33

would be the first column so M N O we're

144:37

in the third column you can imagine if

144:39

we have a buttload of columns what kind

144:42

of problems you're going to run into so

144:44

we'll get to that when we get to it okay

144:46

now they have a range lookup we're going

144:48

to leave that blank for the time being

144:50

we're just going to execute this formula

144:52

as is and for this we're getting an NA

144:55

error if we actually click into it value

144:57

not available error and why is that well

145:01

if we actually go back to that vlookup

145:02

function in the definition that it

145:05

provides for it the last statement is by

145:07

default the table must be sorted in

145:10

ascending order right now our salary

145:14

values are not sorted so it's having

145:16

issues going through it and actually

145:18

finding that 15,000 because it's

145:20

unsorted anyway we're not going to

145:22

actually sort that table that's going to

145:24

be too much work we can actually now go

145:26

into that fourth parameter of range

145:29

lookup and instead of doing an

145:31

approximate match which was the default

145:33

we're going to do an exact match by

145:35

providing false in that case we find

145:37

that Net2Source Inc is the company

145:39

name of the job with 15,000 now I want

145:43

to autofill for this but we need to

145:45

actually lock some cells real quick so

145:47

I'm going to lock this right now by

145:48

pressing F4 then from there we'll drag

145:50

it down now one thing to note on vlookup

145:53

X lookup and also H lookup this is just

145:55

going to return the first value so in

145:57

this case of this 115,000 it says it's

146:00

Volt Technical Resources however I do a

146:02

Ctrl F of

146:04

115,000 we'll find that yes it's at row

146:07

19 for Volt Technical Resources the

146:09

first one that provides but it's also in

146:11

row 42 with Northrop Grumman so it's only

146:14

providing that first match now what
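A minimal Python model of the exact-match V lookup behaviour described here: search the leftmost column, return from a 1-based column index, and stop at the first match. The function name, the company names and the middle column are placeholders, not the real data set.

```python
def vlookup_exact(lookup_value, table, col_index):
    """Search the first column of table; return the value from col_index (1-based)."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]  # first match wins, later ones are never seen
    return "#N/A"

# Placeholder rows standing in for columns M, N and O (salary, other, company).
table = [
    (115000, "placeholder", "Company A"),
    (15000, "placeholder", "Company B"),
    (115000, "placeholder", "Company C"),
]

print(vlookup_exact(15000, table, 3))   # Company B
print(vlookup_exact(115000, table, 3))  # Company A, even though Company C also matches
print(vlookup_exact(999, table, 3))     # #N/A
```

Because the search only ever reads the first column and counts to the right, there is no column index that reaches anything to its left, which is the limitation raised next.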

146:17

happens if we wanted to next get things

146:19

like the job title or the country itself

146:23

well if I were to put in the first two

146:24

values the lookup value and then the

146:26

table array what will we put for the

146:28

column index number remember in vlookup

146:31

the leftmost column of the table itself

146:34

is what we're going to be looking up but

146:36

however columns A and K are even well

146:39

more left of that table so unfortunately

146:42

we can't use V lookup for this but we will

146:45

be using X lookup for this that's why

146:47

I'm going to recommend it over vlookup

146:49

but I think you guys should start at the

146:50

basics

146:52

first however before we get into that

146:55

we're going to now shift gears and cover

146:57

H look up in order to look up values in

147:00

a horizontally oriented table in this case

147:04

this is horizontally oriented because we

147:06

have things like the months across the

147:09

Horizon if you will and then we have in

147:12

the columns in the column standpoint we

147:14

have the job titles of the different

147:16

ones of data analyst senior data analyst and

147:18

so on now the data in this table is

147:21

calculated using the data from the data

147:24

tab in order to get the counts of months

147:26

and you've previously seen this in the

147:28

last lesson where we went in that hard

147:31

example of sum product where we now go

147:33

through and do some array multiplication

147:36

in order to find out the different

147:37

counts for the job titles based on a

147:40

month anyway for this H look up we want

147:42

to look up based on a month what is the

147:46

associated job count for a specific job

147:49

type

147:50

so let's say we want to just look at

147:52

that may column well we can put in h

147:55

lookup and this looks for a value in the

147:59

top row of a table or array of values

148:02

and Returns the value in the same column

148:04

from a row you specify so only selects

148:08

from that top row for this we provide a

148:11

lookup value in this case let's say

148:12

we're looking up January then from there

148:15

we provide the table array itself we can

148:19

go and just select this data now I could

148:21

technically I could select all this data

148:23

because it's just going to go to the

148:25

associated column associated with this

148:28

so that we included column A doesn't really

148:30

matter then from there we want the row

148:33

index number what value do we want from

148:37

this January do we want data analyst

148:39

senior data analyst senior data

148:40

scientist so we can just count down what

148:42

we want we'll start with data analyst

148:44

first so we'll put in that's the second

148:47

row in this so let's try to enter this

148:50

and for this we get 753 which if you go

148:53

back to this we're doing Jan A1 through

148:56

M7 and then the second one so why are we

148:59

getting

149:00

753 well once again this has to do with

149:04

the range lookup we're doing an

149:07

approximate match similar to vlookup it

149:11

expects that these values for that top

149:14

row are in in this case alphabetical

149:17

order in order to perform that

149:19

approximate match

149:20

these aren't in alphabetical order

149:21

they're actually in chronological order

149:23

so instead we need to specify false now

149:27

running it we get the correct value of

149:29

982 now we can also apply this to a

149:31

situation where maybe we want to

149:33

transpose these values into this new

149:36

table that we have here on month and

149:38

count and then up here I'm going to also

149:40

just specify what we're looking at we're

149:41

going to look at data analyst now with

149:44

our H lookup we're going to be providing

149:46

that lookup value the table array and

149:49

then the row index number say if we

149:52

wanted to go in here instead of data

149:53

analysts we wanted to look at data

149:56

engineer instead how can we get this to

149:59

update well we can use another function

150:02

for this specifically we can use the

150:04

match function for this and this Returns

150:07

the relative position of an item in

150:09

Array that matches a specified value in

150:12

a specified order so in this case I want

150:15

to look up data engineers in the array

150:19

from A2 to A7 it's providing me a one

150:23

cuz it's not it's doing the approximate

150:25

match again once again they're not in

150:27

alphabetical order so I have to specify exact

150:30

match using zero okay and now I get data

150:33

engineers in the fifth place I'm also

150:35

going to move this column over and make

150:37

this a little bit bigger going back into

150:39

that H lookup that we're going to use

150:40

for this we're going to provide that

150:41

lookup value which we want to actually

150:44

lock by pressing F4 then we're going to

150:46

provide the table once again I said you

150:48

can select that A column if you want or not

150:50

we're going to lock all these values as

150:53

well because we'll be dragging it down

150:55

from there we'll be providing the row

150:57

index number which we've calculated

150:59

right here in this P based on that match

151:01

that we're performing we want to lock this

151:03

as well and as far as the range lookup

151:05

well we want to do exact match running

151:08

this we get an NA error because I was

151:10

silly and the lookup value we want to

151:12

actually do is for the month of January

151:14

not the data engineer actual lookup

151:17

confusing this with h lookup sorry about

151:18

that so we'll put in O3 for this instead

151:21

and then running it and now we're

151:22

getting back 236 which is not the data

151:24

Engineers thing we're one off and this

151:27

has to do with how we did our match up

151:30

here which specified A2 to A7 basically

151:33

we're counting down from the second one

151:36

where in h lookup we included all the

151:39

way up to that first row so this is just

151:42

a simple fix by changing this one up

151:43

here to A1 and now our values update

151:46

appropriately and then I can go ahead

151:48

and just drag and drop this all the way

151:50

down and once again going to get into

151:52

some troubleshooting because this is all

151:54

the same values and that's because I

151:56

fully locked this actual month number

151:59

and instead I wanted to press F4 and

152:01

only lock the column of O now finally

152:06

getting to the final answer we have it

152:09

and we can confirm this that data

152:11

Engineers should have 396 on the

152:12

December value that's correct and now we

152:15

can do things like this where I can go

152:16

in and say hey instead I want to look at

152:19

data analyst and it will update for this

152:22

instead now once again with h lookup we

152:25

run into issues like vlookup if there's

152:28

values Above This top row I can't really

152:31

think of any applications where

152:33

that's applicable in this but it is a

152:34

limitation anyway this is why we're

152:36

going to be shifting to the next

152:40

topic and that is using xlookup to now

152:43

based on these salaries that we were

152:46

previously trying to identify

152:47

identifying a job title and a country

152:50

associated with it so what is the

152:53

definition of X lookup and this searches a

152:56

range or an array for a match and

152:59

Returns the corresponding item from a

153:02

second range or array by default an

153:06

exact match is used that's pretty

153:08

awesome considering all the issues we ran

153:10

into with H lookup and V lookup anyway

153:13

instead of using a single table we're

153:14

going to be using multiple ranges for

153:16

this let's get into it first we're going

153:18

to provide the lookup value which in

153:20

this case is 15,000 and then we want to

153:23

provide the lookup array so we need to

153:26

select this entire M column here for

153:28

what we want to actually look up but we

153:31

have these blanks in here so I'm going

153:32

to just do a trick of selecting the O

153:35

column selecting all the way down and

153:38

then from here I'm going to just go in

153:39

and actually change these values to M

153:42

instead now we want this to remain the

153:45

same so I'm going to press F4 to

153:47

actually lock this now that was our

153:49

lookup array now we want to get into

153:52

the return array or where we want to

153:54

actually look to see and that's to the

153:56

left of this in this job title short

153:59

column these arrays have to match up in

154:02

where they are uh where you're selecting

154:04

them so in this case I selected over

154:05

here in the second row I need to do the

154:07

same for the job title then from there

154:09

pressing control shift down I select all

154:12

of them once again I'm going to lock all

154:14

of these by pressing F4 now let's close

154:16

the parentheses and go ahead and execute

154:18

it we can see see that data engineer is

154:21

the lowest paid salary with this 15,000

154:25

now we can also add in this default

154:27

parameter in case you can't find a value

154:30

you can put not found but in our case we

154:32

made or we calculated this min max and

154:34

median from our data set so technically

154:37

this isn't really necessary anyway let's

154:39

see what the other job titles are for

154:41

these Max and median looks like it's

154:43

data scientist and then the data

154:45

engineer for the median which is that

154:47

first one that appears right over here

154:48

in row 19
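A rough Python model of X lookup with separate lookup and return arrays, exact match by default, and an optional not-found value. The function name mirrors the Excel one for readability only, and the rows are made-up samples.

```python
def xlookup(lookup_value, lookup_array, return_array, if_not_found="#N/A"):
    # Walk both arrays in step; they must line up row for row.
    for candidate, result in zip(lookup_array, return_array):
        if candidate == lookup_value:
            return result  # first match wins
    return if_not_found

# Made-up columns; None stands in for a blank salary.
salary = [115000, None, 15000, 115000]
job_title = ["Data Engineer", "Data Analyst", "Data Engineer", "Data Scientist"]
country = ["United States", "France", "Brazil", "United States"]

print(xlookup(15000, salary, job_title))             # Data Engineer
print(xlookup(15000, salary, country))               # Brazil
print(xlookup(999, salary, job_title, "Not found"))  # Not found
```

Because the return array is chosen independently, it can sit to the left or right of the lookup column, which is what V lookup could not do.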

154:50

now doing the same for the country I'm

154:52

going to go ahead and just copy and

154:54

paste that formula in that we had from

154:56

the other cell and I'm going to just

154:57

adjust this now to use column K instead

155:01

of column A for the actual return array

155:05

okay with that updated press enter and

155:07

we can see Brazil has the lowest one and

155:11

what is the highest one United States

155:12

and also the median United

155:16

States all right we're going to crank

155:18

this up a notch and now we're going to

155:20

jump into actually bucketing our salary

155:22

using X lookup specifically I want to

155:25

use this table that I've created in

155:28

order to properly categorize different

155:32

values based on this so in this case we

155:34

have this value of 140,000 it's going to

155:36

fall into our bucket of

155:38

125,000 to 200,000 there's no data in this one

155:40

so I want to say no data this one's

155:42

greater than 200,000 so I want to say

155:44

greater than 200,000 so for this we're

155:46

going to be creating a new column column

155:49

Q and we're going to call it salary year

155:53

bucket I'm going to go ahead also and

155:55

hide this column O for the time being we

155:58

don't really need this for this now

156:00

technically you already have the

156:02

requisite knowledge in order to bucket

156:03

it I could put in a nested IF function

156:08

similar to below and it has 1 2 3 four

156:12

five if you will nested ifs to go

156:14

through and basically check each of the

156:16

different values as it's going through

156:18

in order to bucket it appropriately

156:20

in this case it correctly categorizes it

156:23

and then if I wanted to I can drag and

156:25

drop it all the way down but now this

156:28

sheet is filled with all of these nested

156:32

if statements this is really going to

156:34

slow your spreadsheet down so I don't

156:37

recommend doing this also building

156:39

something like this you've now

156:41

hardcoded in your values into it and

156:44

what if you want to change this later

156:45

you'd have to update all your formulas

156:47

it's a mess don't recommend doing it so

156:49

I'm going to select all this control

156:51

shift down and then just delete it all

156:53

instead we're going to be using X lookup

156:56

for this specifically we need to look up

156:58

the lookup value which is going to be

156:59

the same one that we did before that M2

157:02

and then we want to look up the lookup

157:04

array now I conveniently made this table

157:05

here that it's providing values at if

157:08

you will the higher end of the bucket so

157:11

we're not going to necessarily do an

157:12

exact match for this we'll get to that

157:14

in a second anyway now we want to look

157:16

at what do we want to return the return

157:18

array which is on the left side of this

157:20

table that's the values I actually want

157:22

to return back in that column value if

157:24

not found is not necessarily applicable

157:27

so now getting into how we're actually

157:29

going to match based on these salary

157:30

buckets based on these values

157:32

highlighted in this T column right here

157:35

well we need to do not exact match we

157:38

need to do exact match or next larger

157:42

item and this is the value of one

157:45

basically in this case of this 128,050

157:48

it's going to look for initially an

157:50

exact match of 128,050 and it's going

157:53

to see that nothing's there so then it's

157:55

going to look for the next larger item

157:58

which is that 200,000 so therefore it's

158:00

going to return as we're going to find

158:02

out the

158:03

125,000 to

158:05

200,000 now I can try to drag and drop

158:07

this down but I'm going to run into

158:08

errors because I didn't lock my formulas

158:10

correctly so I need to go back in lock

158:13

that s column with F4 and lock that t

158:16

column with F4 and then I'm just going

158:18

to autofill all the way down and now we

158:21

have all of our different job postings

158:24

bucketed into these different salaries
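
The exact-match-or-next-larger lookup described above can be sketched in Python. This is only an illustration of the matching rule, not how Excel implements it. The bucket table here is hypothetical, apart from the 200,000 upper bound and its 125K to 200K label, and `bucket_salary` is a made-up helper name.

```python
from bisect import bisect_left

# Hypothetical bucket table: each row stores the UPPER end of a salary bucket
# (the lookup array) next to its label (the return array).
UPPER_BOUNDS = [75_000, 125_000, 200_000, 1_000_000]
LABELS = ["<75K", "75K-125K", "125K-200K", ">200K"]


def bucket_salary(salary, upper_bounds=UPPER_BOUNDS, labels=LABELS):
    """Mimic an exact-match-or-next-larger lookup on a sorted list of bounds."""
    # bisect_left gives the index of the first bound >= salary: the exact
    # match when one exists, otherwise the next larger item.
    i = bisect_left(upper_bounds, salary)
    if i == len(upper_bounds):
        return None  # nothing equal or larger, the "not found" case
    return labels[i]


print(bucket_salary(128_050))  # no exact match, falls through to 200,000
print(bucket_salary(125_000))  # exact match on the 125,000 bound
```

Because the bounds and labels live in one table, changing a bucket means editing that table once, which is the same benefit the lookup table gives over nested ifs.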

158:27

so instead I wanted to go through and

158:28

actually change this to be

158:30

150k and then match this to

158:33

150,000 I go do it and it would update

158:35

appropriately I also need to update this

158:37

column as well but now it all updates

158:39

and it's in one single location so this

158:41

is really the power of using that X

158:44

lookup over the ifs in order to perform

158:46

this type of bucketing all right you now

158:48

got some practice problems go through

158:50

and get more familiar with using these

158:52

different lookup functions as I said

158:54

before make sure you're prioritizing

158:56

understanding that X lookup it's the

158:57

most powerful but the one caveat to X

159:01

lookup is that it was introduced around

159:04

2020 so anybody using once again an

159:06

archaic version of Excel from on or

159:10

before that year they're going to have

159:11

compatibility issues using this so

159:14

that's why you need to also be familiar

159:16

with that V lookup and also H lookup you're

159:19

going to encounter them in the wild all

159:21

right with that I'll see you in the next

159:22

one where we're jumping into text

159:27

functions now I know this is a course on

159:29

data analysis but text functions are

159:32

actually imperative for performing

159:35

analysis on Text data and for this we're

159:39

going to be working in this lesson on a

159:41

data set of job applicants and we're

159:44

going to take it a step further using

159:47

text functions in order to analyze

159:50

specifically for our final analysis we

159:52

have information on the different skills

159:55

that each one of these job applicants

159:56

knows so we're going to be able to

159:59

perform an analysis to see what are the

160:01

most common skills from these applicants

160:04

but before we get to that final analysis

160:06

we first need to beef up our knowledge

160:08

we're going to focus on three main areas

160:10

the first one is text combination we're

160:12

going to be working to combine different

160:15

columns into a single column from there

160:17

we'll move into the second one of

160:19

text extraction being able to out of a

160:22

single column extract multiple values

160:25

and finally in the third one performing

160:28

some sort of text search in order to

160:31

also extract out in this case we're

160:33

going to be extracting out the state

160:36

name from an address that contains a

160:39

city state and zip code so for this you

160:41

can start up by opening up the text

160:42

functions workbook and in the data tab

160:45

we have this data set which you haven't

160:47

seen before it's only about 20 rows and

160:50

includes a list of job applicants now

160:53

we're not using the full data science

160:54

job posting data set because a lot of

160:56

the examples we're going to do in this

160:58

it would basically bog down your

161:00

Excel spreadsheet so especially how

161:02

we're going to be implementing these

161:04

it's really meant to be used for smaller

161:07

data sets you may be like Luke what happens

161:09

if I have a bigger data set and need to clean

161:10

up the text well that's where power

161:12

query comes in which we'll be covering

161:14

in the advanced chapters so stick around

161:16

for that

161:20

anyway moving into text combination we

161:22

want to Target these columns right here

161:24

f and g we want to combine them into one

161:27

line to have a single address so I'm

161:30

going to go ahead and hide this column H

161:32

for the time being we're going to be

161:33

putting that full address in column J

161:36

and this one's pretty simple all we're

161:38

going to do is text join which

161:41

concatenates a list or range of text

161:44

strings using a delimiter the first

161:46

thing I need to specify is the delimiter

161:48

how am I going to to separate that

161:49

street and the city state all I want to

161:51

do is a space so I'll do that enclosing

161:54

it in double quotes next is ignore empty

161:58

basically if there was an empty cell in

162:01

here it would just ignore this and it's

162:03

not going to input multiple different

162:05

spaces between it just ignore it so we

162:07

want to in that case we're just going to

162:08

put in true the final one is text and we

162:12

can specify you could do text and then

162:14

comma and then text2 um that's really

162:17

verbose I don't really like doing that

162:19

instead I'm just going to select the

162:21

range of F2 to G2 now we can see that

162:25

the address is fully concatenated and we

162:28

can drag it on down and it works for all

162:30

of

162:32

it now the opposite of combo is

162:35

extraction which we're going to get into

162:36

next and in this case we're just going

162:38

to use a single column and extract out

162:41

multiple values in this case we have

162:43

this full name column we want to extract

162:44

out the first name and the last name go

162:47

ahead and hide these other columns we're

162:48

not using

162:49

in this case we're going to specify the

162:51

text split and it says it splits text

162:55

into rows or columns using delimiters so

162:59

we'll first start by specifying the text

163:01

which is B2 in this case and then the

163:03

column delimiter which in our case is

163:06

going to be that space once again we're

163:08

going to use that double quotes for that

163:10

space and then end Double quotes and

163:12

then this is going to be a dynamic array

163:15

and it has these two values here now

163:18

dragging this all down we see that

163:20

it fills in for all these different

163:21

names now we just split text there also

163:24

could be cases where we maybe want to

163:26

extract out certain amount of values or

163:30

certain amount of text from a column in

163:32

this case we also have our application

163:34

ID number which is a combination of

163:37

letters and numbers but as you can see

163:38

from this there's some values in here

163:41

that are actually repeating sometimes we

163:43

want to refer to this the shorthand of

163:46

this and let's say we only want to get

163:47

the last three digits of the applicant

163:50

ID because we know that's always

163:52

different well in this case we can

163:54

specify the right function and it

163:57

Returns the specified number of

163:59

characters from the end of the text

164:01

string we specify the text itself and

164:04

then the number of characters in this

164:05

case we can just say three and it's

164:07

going to provide back that 548 we could

164:09

also just change that to include all the

164:11

text numbers in case this number gets

164:13

bigger than that and then go ahead and

164:15

drag it all the way down
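
The combination and extraction steps above (text join, text split and right) can be sketched in Python. This is a rough analogy, not Excel's implementation. The helper names, the sample street, name and applicant ID are all hypothetical, apart from the 548 ending mentioned in the lesson.

```python
def text_join(delimiter, ignore_empty, values):
    """Join values with a delimiter, optionally skipping empty strings."""
    if ignore_empty:
        values = [v for v in values if v != ""]
    return delimiter.join(values)


def text_split(text, col_delimiter):
    """Split one string into a list of pieces, one per output column."""
    return text.split(col_delimiter)


def right(text, num_chars):
    """Return the last num_chars characters of text."""
    return text[-num_chars:] if num_chars > 0 else ""


# Hypothetical sample row
street = "123 Main St"
city_state_zip = "San Jose, CA, 95112"
full_name = "Jane Doe"
applicant_id = "APP-2023-548"

full_address = text_join(" ", True, [street, city_state_zip])
first_name, last_name = text_split(full_name, " ")
short_id = right(applicant_id, 3)

print(full_address)
print(first_name, last_name)
print(short_id)
```

With ignore empty set to true, a blank cell in the range does not leave a doubled-up delimiter in the joined result.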

164:19

now one last one before we get into

164:21

actually performing that analysis we

164:22

want to go through and

164:24

extract out the state from this city

164:28

state and zip and as you notice from all

164:30

these they have a common format in that

164:33

the city has a comma and then the state

164:35

starts and then there's another comma

164:37

following that so we're going to be

164:39

using those basically delimiters if you

164:42

will in order to identify where we

164:44

should potentially extract out this

164:47

state value from that is this

164:49

two-letter value so the approach

164:52

we're going to use for this is as we go

164:54

through this is we're going to find the

164:55

location first of that first comma and

164:58

space before this then next we'll find

165:01

where it actually ends and then finally

165:04

using those values we'll actually

165:05

extract out using the mid function that

165:09

state abbreviation so the first thing we

165:11

need to do is find that comma and this

165:14

Returns the starting position of one

165:15

text string within another text string

165:17

so in this I'm going to specify that

165:19

the find text we're going to be

165:21

looking for is the comma itself and

165:24

we're going to be looking at within text

165:27

obviously G2 now we also need to find

165:30

the second comma in this we can use that

165:33

find function again specifically we're

165:36

finding that comma specifying that

165:38

within text of G2 and now we have the

165:42

optional third parameter of start

165:44

number we want to start from nine which

165:48

is the first one we found this in

165:50

running this we get nine now the problem

165:53

here is because we're starting as the

165:55

exact number that the comma actually

165:58

starts that's why we're getting that

166:00

back that value of nine we need slightly

166:02

actually bigger than nine but anyway

166:04

we'll fix that in a bit instead let's

166:07

actually get into extracting out that or

166:09

at least trying to extract out that CA

166:12

of this value and then we'll fix that

166:13

issue in cell R2 so for this we're going

166:16

to be using the mid function which

166:19

similar to that right function

166:21

Returns the characters from the middle

166:23

of a text string given a starting

166:25

position and length so in this case we

166:29

want to extract out G2 and we'll provide

166:33

it the start number of well what's

166:35

the value in Q2 and the number of

166:39

characters and for right now we'll just

166:41

put in we know we want to extract out

166:43

two so we're going to put in two now

166:45

we're running into issues we're only

166:47

getting back a comma if you will and if we

166:50

actually make this longer to actually

166:51

zoom in on here we get comma space CA

166:55

now when providing four and this has to

166:57

do with right here this value on the

167:00

start number isn't correct this nine

167:03

right here is exactly at the comma we

167:06

need to actually specify for that start

167:09

number of where the C is and these are

167:13

all two spaces over so I'm going to come

167:15

in here and I'm just going to modify

167:17

this slightly and add two to this this is

167:20

also going to fix our previous one that

167:23

we had when finding this of 13 because

167:26

13 now is all the way over and then

167:29

finally that mid is fixed we can change

167:31

this now to back to two now you know me

167:34

I don't like hardcoding values something

167:36

like this too and really what we're doing

167:40

here is we're adding two based on

167:42

the length of the comma and then the

167:47

space after it so there's two characters

167:49

in there too this is still that 11 value

167:52

that we saw before similarly inside of

167:54

our mid function I don't like doing this

167:57

two here because States maybe could be

167:59

more than two so I don't want to hold it

168:01

necessarily to that so instead I'm going

168:03

to do R2 minus Q2 which in our case is

168:08

going to be two and we have California

168:11

all right now we can take all the

168:12

different values actually drag it on

168:14

down and we get all of our states

168:16

extracted from this
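
The find and mid steps above can be sketched in Python with 1-based positions so the numbers line up with the walkthrough (9, 11 and 13). The helpers `find` and `mid` are hypothetical stand-ins for the Excel functions, and the sample address is made up so that its first comma sits at position 9.

```python
def find(find_text, within_text, start_num=1):
    """1-based position of find_text, searching from start_num onward."""
    # str.index raises ValueError when nothing is found, similar in spirit
    # to the error Excel shows in that case.
    return within_text.index(find_text, start_num - 1) + 1


def mid(text, start_num, num_chars):
    """num_chars characters of text, starting at 1-based start_num."""
    return text[start_num - 1 : start_num - 1 + num_chars]


address = "San Jose, CA, 95112"  # hypothetical city, state, zip cell
separator = ", "                 # the comma plus the space after it

first_comma = find(",", address)                 # position of the first comma
naive_second = find(",", address, first_comma)   # starts ON the comma, finds it again
state_start = first_comma + len(separator)       # skip the comma and the space
second_comma = find(",", address, state_start)   # now finds the next comma
state = mid(address, state_start, second_comma - state_start)

print(first_comma, naive_second, state_start, second_comma, state)
```

Using the separator length and the difference between the two positions avoids hardcoding the 2s, so a longer state value would still be extracted whole.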

168:21

all right diving into our final analysis

168:22

we're actually going to combine all of these

168:24

different functions we just learned

168:25

about specifically with this data set we

168:28

have this column H right here and it's a

168:32

list of different skills that each one

168:34

of these job applicants have we want to

168:36

combine this and aggregate this in order

168:38

to analyze the most common skills for

168:40

this we're going to have to walk through

168:42

four different steps in order to get

168:45

this into our final visualization that

168:47

we can actually visualize and see here

168:50

so I'm going to go ahead and clear all

168:51

these values so we can get started

168:54

actually doing this the first thing I

168:56

want to do is actually combine all of

168:59

these values into a single long text

169:02

string and we're already having the

169:05

separator of a comma and space between

169:07

each skill so we're going to use that

169:09

same separator to continue separating

169:11

this so using text join we're going to

169:15

first specify the delimiter of that

169:17

comma and a space it's asking if I want

169:19

to ignore those hidden or empty cells I

169:22

do and then finally we need to provide

169:24

the actual text itself so we'll go down

169:28

through and select in our data tab H2 to

169:32

h21 going back up into the formula bar

169:35

closing this parenthesis and then

169:36

pressing enter looks like I have a

169:38

slight typo in here I need to actually

169:41

put double quotes around both of them

169:42

you can't mix double and single quotes

169:45

now we have this super long list uh that

169:48

has all of our different skills in it it

169:49

looks like it's properly delimited now

169:52

that we have all these values in one

169:54

cell we can then use the text split

169:57

function to now separate this into

170:01

different cells because we're going to

170:02

want to then move into transposing it

170:04

next and for this once again the

170:06

delimiter we're using is that comma and

170:09

space running this we have all the

170:11

different values separated out by

170:13

different cells so now almost there we

170:16

need to get into making a table

170:19

right here basically having skills in

170:22

the left hand column and then the counts

170:24

of those skills from what's above here

170:26

so first thing we need to do is get the

170:28

unique values of this but if we just run

170:31

unique on that row six we're going to

170:35

run into an issue to where it actually

170:37

goes out to the right and actually

170:38

doesn't get the unique values for all

170:40

these so the first thing we need to do

170:42

is actually

170:44

transpose which moving it from

170:47

horizontal to vertical of that

170:50

row six okay so it's now up and down all

170:53

the way now in this case we want to run

170:56

the unique function on this to extract

170:59

out all those unique values and

171:02

scrolling down looks like we have all

171:03

the unique values it does have a zero

171:06

because that we did that row six and so

171:08

when we get to these empty cells over

171:10

keep on scrolling over here it records

171:12

as zero I'm fine with that for the time

171:14

being and we'll continue last thing we

171:16

had to do is use basically a count if to

171:19

count these different skills based on

171:21

whether they appear or how often they

171:23

appear in this row of six so I'll type

171:27

in count if we need to specify the range

171:30

first and we'll do six I want it to stay

171:32

there uh as we're because we're going to

171:34

autofill down so I'm going to F4 that

171:36

and then from there specify the criteria

171:39

which is going to be a n Kafka okay so

171:43

three values for that one and then

171:45

dragging this all the way down bam got

171:47

this all filled in all right the last

171:49

thing we need to do is actually

171:50

visualize this cuz we want to visualize

171:53

these skill counts select the area that

171:55

we want we're going to go in and insert

171:57

in under recommended charts you can do a

171:59

bar chart but I'm more a fan of

172:01

horizontal bar charts especially when we

172:02

have text values and we need to be able

172:05

to see all the different names so I'm

172:07

going to have to expand that out a bit

172:09

and I'm going to change this title up

172:11

here just to something like skill count

172:14

of applicants and Bam now we can see

172:18

some trends out of this that a

172:20

lot of people are claiming to have

172:22

experience with data bricks which that's

172:25

unusually high there probably something

172:26

I want to investigate for this but a

172:28

good little thing that we actually can

172:29

analyze and see from this analysis that

172:31

we did one minor note I would normally

172:33

go through and actually sort this from

172:36

high to low and you can definitely do

172:39

this you'd have to copy and paste the

172:41

values over you wouldn't be able to use

172:43

these values right here and sort and

172:45

filter them because we're using the

172:48

modern or dynamic array to find these

172:51

unique values so that's definitely an

172:53

option if you want to do and I

172:54

definitely would recommend you do

172:55

something like that before sharing some

172:57

sort of visualization like this all

173:00

right you now got some practice problems

173:01

to go through and get more familiar with

173:03

these text functions which like I said

173:05

are imperative for data analysis in the

173:07

next lesson we're going to be moving

173:09

into our last one in this chapter on

173:12

formulas and functions on date and time

173:15

functions with that I'll see you in the

173:17

next one
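
The skill count analysis from this lesson (join everything into one string, split it back out, take the unique values, then count each one) can be sketched in Python. The sample skills column is entirely hypothetical and only mirrors the shape of the data, with comma and space between skills.

```python
# Hypothetical skills column, one cell per applicant (one cell left empty)
skills_column = [
    "Kafka, SQL, Python",
    "Databricks, SQL",
    "",
    "Kafka, Databricks, Excel",
    "Databricks, Kafka",
]

DELIMITER = ", "

# Step 1: join the non-empty cells into one long string (ignore empty = true)
joined = DELIMITER.join(cell for cell in skills_column if cell)

# Step 2: split the long string back into individual skills
all_skills = joined.split(DELIMITER)

# Step 3: unique values, keeping first-seen order
unique_skills = list(dict.fromkeys(all_skills))

# Step 4: count how often each unique skill appears (the count if step)
counts = {skill: all_skills.count(skill) for skill in unique_skills}

# Optional: sort from high to low before charting
ranked = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

for skill, count in ranked:
    print(skill, count)
```

Sorting at the end corresponds to the minor note about ordering the values from high to low before sharing the chart.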

173:22

all right saving the shortest lesson for

173:25

last we're going to be focusing on date

173:26

and time functions and for this we're

173:29

going to be using that same data set

173:31

from that last lesson which is about 20

173:34

rows of job applicants now similar to

173:37

text functions we're not using that full

173:39

data science job data set that we've

173:41

been using previously because I find

173:43

it's not common to really use these date

173:47

and time functions on a large set of

173:49

data because it's going to slow down

173:51

your sheets so that's why we're using

173:52

this smaller data set for this once

173:54

again if we needed to actually clean

173:56

up date and time stuff we're going to

173:58

use something like power query which

173:59

we're going to be getting to in the

174:00

advanced chapter anyway we're going to

174:02

be focusing on two main types of

174:03

functions first up are date functions

174:06

which are going to be able to extract out

174:07

things like month day and year and then

174:10

from there we're going to transition

174:11

into time functions extracting things

174:13

out like hour minutes and seconds

174:15

finally we're going to move into that

174:16

final analysis looking at what is the

174:19

time that is most likely for applicants

174:21

to apply to jobs for this we're going to

174:24

be using the date and time functions

174:27

workbook and we'll be working in this

174:29

data sheet for this filling in certain

174:31

values as we go through this I'm going

174:33

to go ahead and hide some of these

174:35

unnecessary columns so we have more

174:37

space to work with

174:40

this anyway jumping right in if we want

174:43

to calculate what the month is we have

174:45

something like the month number putting

174:47

that in that's D2 similarly we can get

174:50

the day by using something like day and

174:53

once again providing it D2 then finally

174:55

something like year we can provide D2 We

174:59

Get 2023 now if I wanted to only extract

175:02

out of this date out of this date time

175:05

if I were to use this date function it

175:07

Returns the number that represents the

175:09

date in Microsoft Excel okay date time

175:11

code got it we're going to put in the

175:13

year so we need to provide the year

175:15

first month and then from there day boom

175:19

and analyzing this we see it is February 14

175:21

2023 one quick refresher on how Excel

175:25

stores those datetime objects so right

175:28

now it's in as the number format of

175:30

date if I change this back to General

175:33

it's going to shift to this number and

175:35

if we recall this stores the values in

175:38

it if we start at something like one

175:41

converting it to a short date we can see

175:43

that it starts at January 1st 1900 now

175:47

if you're working with dates before 1900

175:50

let's say we put in something like

175:51

negative 1 I converted it here to a date

175:54

it's going to just provide all these

175:55

different hash signs here there's a few

175:57

different workarounds for that that's

175:58

beyond the scope of this course main

176:00

thing to understand is how it's actually

176:01

stored within Excel anyway I'm going to

176:04

convert this back up into a date and for

176:08

each of these I want to actually fill in

176:10

the values all the way down bam all

176:13

right close up this home ribbon all

176:15

right next up is today say we needed

176:18

today's date well I can put in the today

176:20

function this actually takes no

176:22

arguments and will provide us the date

176:24

I'm filming this on September the 3rd
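
The date pieces and the serial-number refresher above can be sketched in Python. This is a rough analogy using the standard `datetime` module, not Excel itself. The application date of February 14 2023 comes from the lesson, while the time of day and the helper name `to_excel_serial` are hypothetical. The offset used here assumes dates from March 1900 onward, since Excel counts a February 29 1900 that never existed.

```python
from datetime import date, datetime

# Day zero that makes Python's day counts line up with Excel's serial numbers
# for dates from 1900-03-01 onward (assumption stated in the lead-in).
EXCEL_EPOCH = date(1899, 12, 30)


def to_excel_serial(d):
    """Whole-day serial number Excel shows when a date is set to General."""
    return (d - EXCEL_EPOCH).days


applied = datetime(2023, 2, 14, 16, 30, 5)  # time of day is made up

# month, day and year pulled out of the datetime
month, day, year = applied.month, applied.day, applied.year

# rebuilding a date-only value from year, month, day (year goes first)
date_only = date(year, month, day)

serial = to_excel_serial(date_only)

print(month, day, year)
print(date_only.isoformat())
print(serial)
```

Seeing the date as a plain day count is what makes subtracting two dates, or asking for the days since something happened, ordinary arithmetic.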

176:27

now the last common function that I find

176:28

myself using all the time is when I

176:30

want to calculate the days since

176:32

something happened in this case we want to

176:34

find out how many days has it been since

176:38

they have applied to the job so we can

176:41

use the DATEDIF function for this now

176:45

the one thing to note with this is I'm

176:47

typing it in there's no if I type in

176:49

just date there's no DATEDIF in there

176:52

there's no documentation that Excel

176:55

natively actually includes for you to

176:58

use this so this is like a function you

177:00

just have to know about anyway it takes

177:02

three parameters basically the start

177:05

date that we want to start from the

177:08

reference date that we want to basically

177:10

subtract from this which is today we

177:12

want to actually go ahead and lock this

177:13

I'm going to lock this with F4 and we

177:15

want to provide this in the format of

177:18

days which we provide this text

177:20

character of D and this tells us it's

177:22

been about 567 days since Valentine's

177:25

Day in 2023 anyway updating all these

177:28

cells for this we now have this

177:32

data shifting gears into our time

177:36

functions as we can expect a lot of

177:38

these are going to be the same hour we

177:40

use hour function minute has a function

177:43

as well as second but this doesn't

177:45

really show seconds but we can see up

177:47

here it is actually included in your

177:49

data similar to the date function for

177:52

time we have to provide three parameters

177:54

of hour minute and then also second drag

177:58

and drop this all the way down we can

177:59

see that yep it's correlating correctly
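
The days-since calculation and the time pieces covered so far can be sketched in Python. This is an analogy with the standard `datetime` module, not Excel's functions. The time of day is hypothetical, and the filming year of 2024 is an assumption chosen because it reproduces the roughly 567 days quoted above.

```python
from datetime import date, datetime, time

applied = datetime(2023, 2, 14, 16, 30, 5)  # time of day is made up
today = date(2024, 9, 3)                    # assumed filming date

# days between the application date and today, like asking for "d"
days_since = (today - applied.date()).days

# hour, minute and second pulled out of the datetime (hour is 24-hour)
hour, minute, second = applied.hour, applied.minute, applied.second

# rebuilding a time-only value from hour, minute, second
time_only = time(hour, minute, second)

# formatting as text: 24-hour first, then with an AM/PM marker
label_24h = applied.strftime("%H:%M")
label_12h = applied.strftime("%I:%M %p")

print(days_since)
print(hour, minute, second)
print(time_only.isoformat())
print(label_24h)
print(label_12h)
```

Keeping the hour as a 24-hour number is what makes it easy to count applications per hour later on.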

178:02

one note for the hour that we previously

178:04

calculated this is in military time or

178:06

if you're in Europe you also do it this

178:08

way anyway I really like this for an

178:10

analysis purpose especially when we get

178:11

into analyzing it now conversely we can

178:14

also use for time and also date you

178:17

could use the text function which we

178:19

previously saw when we were extracting

178:21

out the month out of date Times by

178:23

providing a value and then the format

178:26

text which we're going to say in this

178:28

case is just hour hour minute minute if

178:31

I wanted that am PM format not that

178:33

military time format I can just add in

178:36

Here Am Pm and it converts it

178:39

appropriately dragging this all down and

178:41

then filling it in we get

178:45

it now moving into that final analysis

178:47

we want to analyze when are these job

178:50

postings Happening by hour of day the

178:53

first thing we need to actually do is

178:55

get a column here of the hours in the day

178:59

so we can do some sort of like count if

179:00

on it in order to calculate that so for

179:02

this I'm going to use the sequence

179:05

function and I want 24 rows with it

179:08

columns I'm going to leave blank and I want

179:10

to start at one and it's going to fill

179:12

down from 1 all the way to 24 and now we

179:16

need to run a count if basically for each

179:19

one of these conditions run down this

179:21

list basically matching to see what is

179:23

the hour for these things so I have it

179:25

hidden but I'm going to go ahead and

179:26

make column again for hour and I'll put

179:29

in here hour and unlike last time I'm

179:31

actually just going to put the whole

179:32

range in here and it's going to provide

179:34

me back it in a modern array now with

179:37

this I can actually now use this in the

179:39

count if we want to First provide it a

179:42

range which is our modern array so it's

179:44

going to do I2 hashtag and then a

179:47

criteria for the hour we want to search

179:49

for we want to search for that one A2

179:51

from here we want to fill it all in and

179:53

we have some reference errors because we

179:56

didn't lock our cells specifically we

179:58

didn't lock this cell right here this I2

180:01

so I'm going to press F4 on that to

180:02

actually lock that then dragging it all

180:05

the way down we have it okay our last

180:08

portion of this is actually visualizing

180:10

this so we're going to go in select all

180:12

that data go to insert go to recommended

180:14

charts and I'm more of a fan of column

180:18

charts with this type of data so I'm

180:20

going to go ahead and put this in and

180:21

I'm going to change this to job postings

180:25

per hour and Bam now from this we're

180:28

seeing that basically people are

180:31

applying during normal working hours and

180:33

apparently they're waiting until the end

180:35

of the day to actually submit their job

180:37

applications maybe to get in before a

180:39

deadline or something all right this is

180:40

the last lesson on functions and

180:43

formulas in the next chapter we're going

180:45

to be moving deeper into understanding

180:47

how to actually make these different

180:50

visualizations I've only been showing

180:51

you a sneak peek at it right now to get

180:53

you familiar with how to easily create

180:55

it but we're going to go into a lot

180:56

greater detail coming up next now we

180:59

spent almost nine lessons on these

181:01

functions and it's because I feel

181:04

functions are one of the most important

181:05

things to understand about Excel because

181:07

it also transfers to other portions

181:10

specifically we're going to be learning

181:12

more about the Dax language in the

181:15

advanced chapter and we're going to

181:16

apply a lot of our knowledge that we

181:17

already know about these Excel functions

181:20

to Dax functions they're very similar

181:22

anyway you got some practice problems to

181:23

go through and work in order to

181:26

understand better how to use these

181:27

datetime functions and from there we'll

181:28

get into that chart chapter with that

181:30

I'll see you in the next

181:35

one welcome to this chapter on charts

181:38

and as much as I love using something

181:41

like python a programming language for

181:43

making

181:44

visualizations I feel that Excel has

181:47

some capabilities built into it that

181:50

allow it to basically exceed any

181:52

programming language and the

181:53

customization that you can do to charts

181:54

that we'll be finding out in this

181:56

chapter for this chapter we have four

181:59

lessons this lesson right here is an

182:01

intro to charts so we're going to be

182:03

focusing on understanding the basics of

182:04

using charts and specifically looking at

182:08

three types of charts specifically line

182:10

charts pie charts and bar or column

182:12

charts so technically that's four in the

182:15

second lesson we're going to move into

182:16

more advanced charts such as Scatter

182:18

Plots and also map charts along with

182:21

understanding more advanced

182:23

customizations that we can do to these

182:25

charts in the third lesson we're going

182:26

to go hard in the paint in order to

182:28

understand statistical charts

182:30

specifically histograms and then also

182:32

box and whisker charts which are

182:35

imperative to understand statistical

182:37

distributions of our data we'll finally

182:39

wrap this all up with a final lesson

182:41

focusing on spark lines which basically

182:43

allow us to put charts inside of

182:46

individual cells

182:48

in Excel pretty neat all right for this

182:50

lesson we're going to be using the

182:52

charts intro

182:56

workbook first thing to understand is

182:58

terminology Microsoft refers to all

183:02

these different visualizations diagrams

183:04

plots whatever you want to call it they

183:06

refer to it as a chart basically they want

183:09

to use a safe term that encompasses all

183:11

the different type of visualizations we

183:13

can build with this so you may hear me

183:14

from time to time call this a plot or

183:16

visualization I basically mean a chart

183:18

anyway why do we use charts well looking at

183:22

these six examples here we can see some

183:24

different characteristics about this

183:25

data that we're looking at but what if

183:27

we looked at just the core data itself

183:30

which is this table right here looking

183:31

at what is the number of job postings

183:35

per month if we look at this visually

183:38

we're not able to see necessarily what

183:41

is the highest month and also what is

183:44

the lowest month I mean you can figure

183:46

out eventually but it's not easy to spot

183:48

and that's why charts are so powerful

183:51

and so I have a variety of

183:52

visualizations here in order to Showcase

183:55

that same table that we were just

183:57

looking at in basically a variety of

184:00

different forms here even have a few

184:02

below here down below it but we need to

184:05

understand which chart to use because

184:08

let's say we wanted to use this pie

184:10

chart here is that actually a good chart

184:13

to use to visualize this or instead

184:15

should we be using something like this

184:18

line chart to better show a trend over

184:20

time while also showing a magnitude of

184:23

difference anyway as we go through this

184:25

lesson I'm going to be calling out when

184:27

you should use certain charts as best

184:30

practice along with my recommended tips

184:32

for how to customize it to show them

184:38

best so for our first chart as I hinted

184:40

to we're going to be making this job

184:43

posting count into a line chart and this

184:47

is the chart I'd use typically for any

184:49

time series like data as it's great at

184:52

showing a trend over time and how it's

184:54

connected so how do we do this well

184:56

we're going to select all the data here

184:58

all the way from A1 down to B13 come up

185:02

into insert and we're going to dive into

185:05

each one of these charts individually

185:06

but I would encourage you to actually

185:08

just start with recommended charts I

185:10

really jump to it every time I use it

185:12

anyway first thing it has two tabs

185:14

here recommended charts and all charts

185:15

recommended charts usually provides

185:19

a lot of good tips that you could

185:21

potentially use for different charts

185:24

sometimes however I do find that I want

185:26

a particular chart and it's not here and

185:27

that's when I'm going to go to this all

185:29

charts Tab and frankly it provides a lot

185:31

more control while allowing you to

185:34

actually visualize your different data

185:36

in our case I know I want a line chart

185:38

on this but now I can go in and actually

185:41

plot it with markers or even change it

185:44

into a 3D line chart I highly don't

185:46

recommend this we're going to be

185:47

sticking to a line chart for this and

185:49

I'm going to go ahead and click okay I'm

185:51

not going to lie this chart is getting

185:54

us 90% of the way there now if you

185:57

notice for this when we clicked on the

185:58

chart we have certain values highlighted

186:02

here basically this purple outline is

186:04

showing that this is the X values right

186:07

here and then the blue coordinates right

186:09

here are showing the actual values

186:11

themselves and then conveniently they

186:13

put the job posting count which is

186:15

highlighted in Orange as the title we'll

186:18

be jumping into how to customize this

186:20

area in the advanced section but that's

186:23

in the next lesson now for those new to

186:25

charts there's a bunch of different

186:27

elements and I can come up here and I

186:28

can click this plus icon right here and

186:30

it shows all the different elements on

186:33

here I can use the checkbox to control

186:35

whether I want to include the axes or

186:38

not in this case I do want to include it

186:40

and then I can even fine-tune it further

186:42

to select which one I'm talking about am

186:44

I talking about the horizontal or am I

186:46

talking about the vertical

186:48

just going through these in Rapid

186:49

fashion axis titles allow us to

186:51

provide titles for the X and Y axes the

186:53

chart title shown above I can remove it

186:56

or keep it on if I want to include data

186:58

labels I can do this along with

187:00

controlling what position of them I want

187:02

to go with I could also include

187:04

something like a data table below but

187:06

personally I find this is sometimes

187:07

sensory overload I don't really use that

187:09

much next are error bars for data grid

187:12

lines whether I want to have horizontal

187:15

vertical some minor ones or some other

187:17

minor ones a legend if there's more than

187:20

one data series I probably want this a trend

187:22

line which we'll be adding in a

187:24

little bit and then up and down bars

187:27

which are going to show whether the data

187:28

goes up or down based on each set but

187:30

not really necessarily applicable to

187:31

this one now I find this plus icon is

187:34

where I go most of the time but I could

187:36

also go to this chart design tab up here

187:40

and it has this box of add chart

187:41

elements and basically you can go

187:43

through and adjust all the different

187:45

ones along with showing a more visual

187:48

indication of what's going on here here

187:51

showing the actual up down bars

187:53

to actually see what they actually look

187:55

like you can also use this quick layouts

187:57

to quickly try out different themes that

188:01

Excel has so I do find myself from time to

188:03

time using this so this chart is almost

188:06

done all I do want to do first is change

188:08

the title and I usually like to either

188:11

provide some sort of snippet of

188:13

information from it or ask a question

188:16

that I want the reader of this graph to

188:19

understand or take away from this chart

188:22

so I can put in something like how did

188:24

jobs Trend in 2023 so it also tells what

188:28

year what's going on here and it asks

188:30

them to look at hey what is the trend

188:32

going on here which it looks like we

188:33

have a peak in January and a peak

188:35

in August now I try to minimize the

188:38

amount of axis titles on here because

188:39

like in the month's case that's pretty

188:41

self-explanatory however the number in

188:43

the y-axis is not so self-explanatory so

188:47

in that case I would want to include it

188:50

in this case give it a representative

188:51

name of counts of jobs the last thing I

188:54

want to do with this is just add a trend

188:56

line and there's multiple different

188:57

options for this we can do linear

188:59

exponential a linear forecast where it

189:01

actually goes into the future and then

189:03

even a two period moving average which

189:06

is pretty neat I'm going to just stick

189:08

with the basic one right now of linear
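
as a rough illustration of what a linear trend line computes here is a minimal Python sketch it is not part of the course which uses Excel's built-in trend line and the function name linear_trend and the sample counts are hypothetical it fits an ordinary least-squares line through values indexed 1 to n and the same slope and intercept can be extended one step ahead which is the idea behind the linear forecast option

```python
def linear_trend(values):
    """Fit a least-squares line y = slope * x + intercept.

    x is assumed to be the position of each value (1, 2, ..., n),
    which matches plotting monthly values in order on a line chart.
    """
    n = len(values)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    # Sum of squared deviations of x, and of x times y deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly counts, not the course data set.
counts = [10, 12, 14, 16]
slope, intercept = linear_trend(counts)

# Extend the fitted line one period past the data (a simple forecast).
next_period = len(counts) + 1
forecast = slope * next_period + intercept
print(slope, intercept, forecast)
```

the sketch needs at least two values because a single point has no slope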

189:10

and Bam that's our first chart so let's

189:12

move on to the next

189:15

one now if we go back to our original

189:17

data set in the data tab we have a

189:20

column here on job no degree mention and

189:24

basically this column right here

189:27

includes whether there's a mention of a

189:30

degree in a job posting so in this case

189:34

where we have two different values we're

189:37

trying to determine what are the

189:39

proportions of each a way to compare

189:42

this we could either compare this in

189:44

like a bar or column chart but I feel a

189:46

better one for this is a pie chart so

189:48

I've gone through and calculated a count

189:51

of the jobs with a no degree mention

189:54

along with those that have a mention of

189:57

a degree I calculated the total and then

190:00

from that I calculated their individual

190:03

percentages now I'm not going to just

190:05

select all the data here because I don't

190:07

want to plot all of it I'm going to

190:08

select the first two values here of A2

190:11

A3 press control and then also select C2

190:14

to C3 then from here now I'm going

190:17

to go insert those recommended charts

190:20

looks like we've got two bar and column charts

190:23

come up but the one we're going to be

190:24

using for this is a pie chart so I'm

190:26

going to go ahead and insert that in now

190:28

personally I'm not a fan of this layout

190:31

here so I'm going to come up into chart

190:32

design into quick layouts and I'm going

190:34

to just experiment with different ones

190:37

looking at them and frankly I like the

190:39

one this one right here actually where

190:42

we've removed the legend and put the

190:44

actual values themselves along with

190:46

their titles inside the pie chart itself

190:49

to make it super simple to see which one

190:51

is which now Excel sometimes gets crazy

190:55

with the colors I actually don't

190:57

recommend using a lot of different

190:59

colors because it could be very

191:00

confusing for viewers on where to look

191:03

personally I want to highlight more of

191:05

the no degree mentioned so I'm going to

191:09

use this single color palette right here

191:12

or this monochromatic color palette

191:14

right here that has these different

191:15

shades of blue and I feel the eye is

191:18

going to go more to the darker blue now

191:20

with each of these labels here I can

191:23

actually select it I double clicked it

191:25

over time I can actually drag it and

191:27

drop it and move it around where I want

191:29

it to be I would probably want it to be

191:30

more over here I want the degree

191:32

mentioned to be stacked basically I want

191:35

them opposite of each other now you may

191:37

have noticed I can't really read this

191:40

text right here and even this text is

191:42

hard to read as well so what I can do is

191:45

I'll just click outside real quick and

191:47

clicking back in I'm going to double

191:49

click and this is going to bring up the

191:51

format data labels if double clicking

191:54

isn't working you can just select it go

191:56

into the format tab up here and select

191:59

format selection anyway there's a lot to

192:02

unpack in this Pane and we'll be

192:03

unpacking it as we go along this entire

192:05

chapter but the main thing to understand

192:07

is they have label options and text

192:09

options we want to adjust the text

192:12

options and this has things like text

192:13

fill and outline text effects and then

192:16

also the text box for this we're trying

192:18

to use the text fill and outline

192:21

specifically this drop down here of text

192:24

fill we want to change the color so we

192:27

want to change it to White now if you

192:29

notice only one of these changed and

192:31

that's because I only had one of the

192:33

boxes selected so I'll actually

192:36

click out of this double click back into

192:38

this and then make sure both of these

192:40

are actually selected go back into text

192:42

options go into text fill and then

192:44

change this color and then it's going to

192:45

change both of these colors

192:47

now I'm fine with this text now but

192:49

let's say I wanted to customize further

192:51

the percentage here maybe I want to

192:53

include one more decimal place clicking

192:55

on the box itself I can now have this

192:58

option for label options and then under

193:01

well label options again I can scroll

193:03

all the way down or I can actually collapse

193:05

this up and then unhide this number I

193:08

can change the number formatting itself

193:10

in this case I do want to still do a

193:12

percentage and then maybe I want to do

193:14

one decimal place personally I think

193:16

there's a little bit too much

193:17

data so we're just going to keep it with

193:18

the zero all right that's the final

193:20

customization the last thing we want to

193:22

do is just add a title and I want a very

193:25

compelling title what do they want to

193:27

look at for this I want them to

193:28

understand what jobs mention a degree

193:33

and now with this we have a pretty great

193:36

visual indication of that about one of

193:40

jobs have no degree mention in them

193:44

which personally I think that's a pretty

193:45

high percentage and hopefully gets

193:50

higher so we have data similar to our

193:53

first chart that basically explains how

193:55

many counts of jobs for the different

193:58

job titles now this isn't chronological

194:01

so I don't necessarily recommend using

194:04

something like a line chart for this

194:05

that's why we're going to be making

194:07

column and bar charts for this also let me

194:09

explain the difference between the two

194:10

anyway I'm using the formulas that we

194:12

previously have covered you can dive

194:14

into it if you want to basically using

194:15

unique and then also a COUNTIF formula in

194:19

order to count each one of these in

194:21

the data tab anyway if I actually go

194:23

to graph these by selecting all these

194:26

things go to insert and recommended

194:28

charts here provides the recommended

194:30

charts and we're going to start with a

194:32

column chart first I start with this one

194:35

first because we're already running into

194:37

problems with how long these labels are

194:41

we can see that we have these three

194:42

ellipses here basically telling us that

194:45

the rest of the name is hidden here so

194:47

not all the names are shown here the

194:50

other problem that we're getting into

194:52

with this column chart um named after

194:55

the fact that it looks like columns is

194:58

that it's not in an organized manner I

195:00

would expect to see it high to low to

195:02

make it easier to compare values to

195:05

each other and also how they rank so

195:07

we'll go ahead and delete this bad boy

195:09

anyway this table is organized based on

195:12

this unique function which doesn't

195:15

necessarily put things in the correct

195:17

order and I won't be able to actually go

195:18

through and filter it or sort it

195:21

appropriately So Below this I made a

195:25

different table where I basically use

195:27

sort to sort these values from above by

195:32

their job count in descending order now

195:36

since it's in this order I could

195:37

actually select a few less of this

195:39

remember how it was cut off last time I

195:41

could select only the top six go into

195:44

here go into recommended charts and once

195:46

again put in our clustered column

195:49

chart now this one I can play around

195:52

with and as you see as I expand it out I

195:55

can actually see all the different names

195:58

here but once again I'm not a fan of

196:00

this column chart I'm not going to be

196:02

using it for this case instead we're

196:03

going to try out a bar chart so

196:06

selecting all this data to show the

196:08

power of these bar charts and then

196:11

coming in I can put in that bar chart

196:13

now I do like this one better because

196:15

all the titles are organized and

196:18

they're right off to the side and so

196:19

this is a much easier read the

196:22

problem now is I'm really nitpicky with

196:24

my charts the problem now is I don't

196:26

like the order that this is in what

196:28

happens is Excel starts plotting

196:31

these although it's in descending order

196:33

in our table as shown over here it's

196:36

going to be plotting them starting at

196:38

this zero axis up here and then plotting

196:40

from there so technically we don't even

196:42

want it like this instead what I can do

196:46

is reverse the sort order here I'm just

196:49

controlling it by using uh either one or

196:52

negative one in that sort order portion
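
to make the unique then count then sort steps concrete here is a minimal Python sketch of the same idea it is only an illustration and not how Excel implements UNIQUE COUNTIF or SORT the function name count_and_sort and the sample titles are hypothetical and the order argument follows the same convention described above of 1 for ascending and -1 for descending

```python
from collections import Counter


def count_and_sort(titles, order=-1):
    """Count each distinct title, then sort the (title, count) pairs by count.

    order=1 sorts ascending and order=-1 sorts descending, mirroring the
    1 / -1 sort order convention mentioned in the lesson.
    """
    counts = Counter(titles)  # distinct titles with their counts
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=(order == -1))


# Hypothetical sample rows, not the course data set.
titles = [
    "Data Analyst", "Data Engineer", "Data Analyst",
    "Data Scientist", "Data Analyst", "Data Engineer",
]

descending = count_and_sort(titles, order=-1)
ascending = count_and_sort(titles, order=1)
print(descending)
print(ascending)
```

flipping order between -1 and 1 only reverses the table and the counts themselves do not change which is why the same data can feed either a column chart or a bar chart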

196:55

anyway with this order now we can

196:58

finally get into the final bar chart

197:00

that we want to actually put in and I'm

197:01

just going to skip this recommended

197:02

charts come up here into the column and

197:04

then the bar charts we want this one

197:06

inserted in and I'm also going to zoom

197:08

out some now this is more in line with

197:12

what I want let's actually clean up this

197:14

visualization to identify what we want

197:17

I'm actually more curious about what are

197:19

the top jobs in data science so that's

197:22

what we'll name it additionally I feel the

197:24

titles are pretty self-explanatory based

197:27

on that title but I would need something

197:29

for the x-axis down here so we'll add an

197:31

axis title calling this count of job

197:34

postings now with this question I'm

197:36

asking of what are the top jobs in data

197:39

science I'm not really feeling like we

197:41

need to include things like machine

197:44

learning Engineers software Engineers

197:45

cloud engineers or business

197:47

analyst how could I actually adjust this

197:50

well one way is I could control what

197:53

areas are highlighted over here and I

197:55

could actually drag this and change this

197:58

to whichever ones I want um but I'm not

198:01

necessarily going to recommend that

198:02

instead I'm going to select our data

198:03

make sure all the columns are selected

198:05

themselves right-click it and then go to

198:08

select data and this new window is going

198:11

to pop up here this tells us a lot of

198:14

great things about our visualization

198:16

first is the chart data range it tells

198:18

us we've selected from A25 to B35 so we

198:21

could change that here if we wanted to

198:23

the next thing is the two windows down

198:26

here of the legend entries and the

198:28

horizontal axis so this controls our job

198:31

count I'm going to scroll this over here

198:33

we could just remove job count but it's

198:35

not going to do anything this guys

198:36

mainly right here the axis labels we

198:38

can control so I know I want data

198:40

analyst and all the way down to

198:42

senior data analyst I can actually go

198:44

through and select remove business

198:46

analyst machine learning engineer

198:49

software engineer and Cloud engineer and

198:52

then click okay and it will remove it

198:54

from this visualization while still

198:56

keeping this data here so I can easily

198:58

go back and add or remove job titles as

199:01

necessary and now we have our final

199:04

visualization earlier I did go through

199:06

and actually delete the chart and start

199:09

over but you do have this option in the

199:11

chart design tab of change chart type

199:14

and it allows you to basically go through

199:16

and try out different ones if I wanted

199:19

to go back to that column chart I could

199:22

and it would show me an example of what

199:23

it looks like now there is one last

199:26

thing that I want to format on this I do

199:28

find it a little difficult to read

199:29

exactly what are the amount of job

199:32

postings that they have here so I'm

199:34

going to add data labels to this we have

199:37

a couple different options we can be

199:39

inside end which I can't read at all

199:41

inside Base outside end which I'm more

199:43

for and then also a data callout that's

199:45

just too much there we're going to do

199:46

outside end now with these numbers

199:49

I don't like the level of detail I don't

199:51

need down to the ones

199:54

digit place to tell what it is instead

199:56

I would rather it shows something like

199:59

9.6K or 9.6 thousand so we can actually format

200:03

that so double clicking on one of those

200:05

labels this format pane is going

200:07

to pop up again and for this I'm going

200:09

to go under label options and then label

200:12

options again and finally number and for

200:15

this I'm going to use instead of

200:18

any one of these I'm going to use a

200:20

custom type now I have a few of these

200:22

already built into here and so they may

200:25

not pop up for you but this is actually

200:27

sneak peek this is actually what we want

200:29

but if you don't have this popping up

200:30

right now what you can do is actually go

200:33

in in this case I'll just show a

200:34

different value what we're going to

200:35

first say is how we want this formatted

200:38

with how many decimal places so I want

200:40

all the values before the decimal place

200:42

then a decimal place and then I only

200:44

want in this case let's go with two

200:46

places after the decimal place and then

200:48

from there I want a K on the end so

200:51

basically to show this as a thousand so

200:53

I'm going to use a parenthesis put a k

200:56

and then close parenthesis and I'm going

200:58

to click add okay so now this changes it

201:00

to the double digits for explaining that

201:04

this is the thousands this automatically

201:07

whenever I do that K parentheses it

201:09

automatically does the math to basically

201:12

divide that by a thousand and convert this to

201:15

K instead of the thousands anyway I

201:18

don't really I'm going to go with the

201:19

original one I had of only one decimal

201:21

place and Bam that's our final

201:23

visualization and we can see from this

201:26

that we have a lot of insights into

201:27

understanding that more Junior roles

201:29

like data analysts data scientists data

201:31

Engineers are more prevalent than the

201:32

senior roles and that luckily it seems

201:35

like there's a lot more data analyst

201:36

roles than data scientists and data

201:38

Engineers all right you now have some

201:39

practice problems to go through and get

201:41

more familiar with those four major types

201:44

of visualizations that frankly I feel

201:47

I'm using on a daily basis anytime I'm

201:49

making visualizations so don't think

201:51

that they're just too plain or too

201:52

simple they're really powerful in

201:54

explaining data in the next lesson we're

201:57

going to be jumping into not only more

201:58

advanced charts but even more advanced

202:01

customization so with that I'll see you

202:03

in that

202:07

one we're going to crank this up a notch

202:09

and get into some more advanced

202:12

visualizations specifically on this

202:14

we're going to be doing a deeper dive

202:16

into the pay of different jobs not

202:20

only based on the different job titles

202:23

but also based on where a job is located

202:27

using things like a map chart and so for

202:30

all these charts also we're going to be

202:31

looking into how we can further get into

202:34

deeper customization of

202:39

these so Scatter Plots are great at

202:42

comparing two numerical values in our

202:45

data set we have these two columns here

202:49

one on the salary year average and the

202:51

other on the salary hour average just as

202:54

a background on why it's called average

202:56

at the end of these sometimes job

202:58

postings have a range of salary and so I

203:01

took the average of the Min and Max and

203:04

hence I named this average anyway we

203:07

have yearly salary data and we have

203:09

hourly salary data what I did next is

203:12

get the unique value of the job titles

203:14

and then from there using that median

203:17

basically modified median IF function

203:19

got the yearly median salaries and then

203:22

the hourly median salaries so because we

203:25

have these two numerical values to

203:28

compare basically we want to see if

203:30

there's a trend correlated between the

203:32

two because well there is we're going to

203:34

find out I'm going to go ahead and

203:36

select these all then from there go into

203:38

insert and we can come into charts I

203:41

know I want a scatter plot and if we go

203:43

to insert it in can't see cuz it's

203:46

hidden behind here well we'll just go

203:48

ahead and show it this isn't necessarily

203:50

showing us what I want it to show with

203:54

this it's basically showing hey this is

203:56

the yearly data up here in the blue and

203:58

then this is the hourly data since

204:00

hourly data is super low it didn't

204:02

work out how I wanted to by selecting

204:04

all the data like we've previously been

204:06

doing instead I'm going to go ahead and

204:07

delete this what we're going to do is

204:10

we're only going to select basically

204:12

this B and C column of data once again

204:15

we're going to try again inserting that

204:17

scatter plot and at this point it's

204:20

actually working correctly as we want it

204:22

unfortunately we can't tell there's no

204:25

basically like data labels for this to

204:28

understand what are the different job

204:29

titles associated with it even with the

204:31

graph we can see that it's only

204:32

highlighting this also the incorrect

204:34

titles up here it's not just hourly

204:36

median salary we're going to fix all

204:38

this anyway the first thing that I want

204:40

to clean up is actually the selection of

204:42

data right now we can see these numbers

204:43

are overlapping down here also it goes

204:45

all the way down to this zero axis on

204:48

both the X and Y I want to change that

204:52

so I'm going to double click this x axis

204:54

and the format axis pane pops up and we

204:57

can see that we have bounds here 0 to

204:59

180,000 I can see that there's no values

205:01

under about 75,000 so I'm going to go

205:04

ahead and put that in for the minimum

205:06

and press enter so it's going to jump

205:08

this way now I want to do the same thing

205:11

for the y-axis I'll just double click it

205:14

and this one didn't necessarily go where

205:16

I wanted it to go I wanted to actually

205:18

change the values here so we can go

205:20

under axis options under axis

205:23

options again and under axis options

205:25

again we can change this minimum maximum

205:28

I'm going to change it to looks like

205:30

there's nothing above 20 or below 25 so

205:33

we're going to go with that now even

205:35

with this change in the formatting of

205:37

the values here the minimum I can still

205:40

see that there's overlap here so I want

205:43

to update this similar to last time

205:45

basically cut it off at the thousands place

205:47

and put a K at the end so

205:50

under axis options axis options

205:52

again I'm going to close this drop down

205:54

of axis options also instead we're

205:56

going to go to number for this we want a

205:58

custom type and I do have some values in

206:02

here but we're just going to go if you

206:04

don't have them in here we're going to

206:05

add a new one specifically with this I

206:08

wanted to show a

206:11

dollar sign at the front and I don't

206:12

want any decimal places whatsoever so

206:15

I'm just going to put a zero in there

206:17

and then from there like last time I

206:19

want to format this in the thousands

206:21

place so I'm going to put a comma and

206:24

then double quotes to put around the K

206:28

which signifies I want to format this

206:29

in the thousands place I'm going to go

206:31

ahead and click add and now this is much

206:34

more readable not so much sensory

206:37

overload for our y-axis I don't care at

206:40

all about this decimal place right here
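
for reference the custom type described here for the x-axis with a dollar sign a zero a comma and a K in double quotes would be written as $0,"K" and the earlier one decimal label would be 0.0,"K" where the comma after the digits is what scales the displayed number down by a thousand the following minimal Python sketch only imitates that display logic to show the arithmetic the function name format_thousands and its parameters are hypothetical and rounding of exact ties may differ from Excel

```python
def format_thousands(value, decimals=1, prefix=""):
    """Show a number in thousands with a trailing K.

    Mimics the display effect of an Excel custom format such as 0.0,"K":
    the value is divided by 1,000 for display only, the underlying
    number is left unchanged.
    """
    scaled = value / 1000  # same effect as the trailing comma in the format
    return f"{prefix}{scaled:.{decimals}f}K"


# Hypothetical values chosen to match the examples mentioned in the lesson.
print(format_thousands(9600))                        # one decimal place
print(format_thousands(75000, decimals=0, prefix="$"))  # dollar sign, no decimals
```

as in Excel only the displayed text changes so anything calculated from the underlying values still uses the full numbers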

206:42

so going back into numbers again I can

206:45

just format the decimal places as

206:47

zero and I'll just leave this one as an

206:49

accounting category now which one's

206:51

yearly and which one's hourly salary well

206:53

we need to include actual axis titles

206:57

for this so I'll go ahead and enable

206:59

that and then for this we're going to do

207:01

something a little bit different I'm

207:03

going to select this y-axis title and

207:07

instead of actually typing in values in

207:09

I want to use actually the column header

207:12

right here so I'm going to come up into

207:13

the formula bar type equal to I'm going

207:16

to select C1 and then press enter and now

207:21

this updates for that column header I can

207:22

do the same thing here for the x-axis

207:25

title selecting it then from there going

207:27

to the formula bar put an equal and

207:29

selecting cell B1 and pressing enter for

207:32

the title we don't want that hourly

207:33

median salary we're really trying to

207:35

find out what jobs have the highest pay

207:38

and we can basically tell it from this

207:39

all right so let's actually finally get

207:41

to adding data labels to this and we can

207:43

see what data labels are actually

207:44

available by scrolling over the

207:46

different options here we're going to

207:48

just go with above for the time being

207:50

then I'm going to close on out of this

207:52

and I'm going to select the data labels

207:54

themselves and format data labels should

207:56

pop up if it doesn't you can also go

207:58

about doing it by right-clicking this

208:01

and going to format data labels anyway

208:03

for this I don't want to actually show

208:06

The X or the Y value for this anyway uh

208:10

I made it disappear by actually

208:13

closing out of that so actually I'm going to

208:14

have to add those data labels again

208:16

anyway going back into it under

208:17

label options label options then label

208:19

options again I'm going to leave that y

208:21

value selected for right now but what I

208:23

want to do now is provide the job title

208:26

itself right next to the data point so I

208:30

can do this option here so label

208:31

contains value from cells and it's going

208:35

to ask me to select the data label range

208:38

and so now this is when I'm going to

208:40

select all of these different job titles

208:43

here and press okay so now we have these

208:46

values from cells I no longer want this

208:48

y value and I do want to include these

208:51

leader lines because we're going to be

208:52

actually dragging this around because as

208:54

you can see some of these values are

208:56

overlapping now also I'm noticing that

208:59

this is really busy right now with all

209:00

this text and stuff so I'm actually

209:02

going to remove the grid lines for the

209:04

time being actually for the remainder of

209:05

this cuz I don't feel like it really

209:07

needs the grid lines in general and now

209:09

I have a little bit less sensory

209:10

overload so I can go through and

209:13

actually clean up where a lot of these

209:15

different job titles are located by just

209:17

selecting it and then dragging it and

209:20

you notice uh we had that leader line

209:21

selected so I have arrows or basically

209:24

lines going to each of these ones to

209:27

signify which one is which so now I've

209:30

dragged these all over so that way

209:31

they're basically more representative of what I want

209:34

sometimes if I dragged off of this and

209:36

drag maybe the whole chart itself and

209:37

make a mistake I just press control Z

209:40

and it reverts it back to where I'm

209:41

going and then I just continue on to

209:43

selecting the box that I want and moving

209:45

it anyway this is pretty neat now I

209:47

could actually go in if I wanted to and

209:50

add a trend line to this and basically

209:53

it shows for an increase in that yearly

209:55

salary I expect the same with the hourly

209:57

data in this case I don't find it as

209:59

useful so I'm going to just

210:02

leave that off but in general it is

210:04

pretty neat to see the trend that's

210:06

going on with this that senior data

210:08

Engineers although they're underpaid

210:10

compared to senior data scientist in

210:13

yearly salary you could get the hookup

210:15

if instead you look for an hourly gig

210:17

in order to get a little bit

210:19

higher pay a similar Dynamic happens

210:22

between business analysts and data

210:24

analyst so if you're a data analyst and

210:26

you're looking for a job maybe on upwork

210:28

maybe you should advertise as a business

210:29

analyst

210:32

instead all right going back to our data

210:34

set itself we have another column in

210:37

here I want to investigate and that's

210:39

specifically around the country it's

210:41

called job country basically where the

210:44

job is located at and I like to

210:46

visualize these types of things well on a

210:48

map to actually see how it affects

210:50

others so I've made this table here

210:53

under the map chart tab where we have

210:55

all the different countries in the

210:57

data set then from there we use a count

210:59

if to determine how many counts for each

211:01

of the countries and then our modified

211:03

MEDIAN IF in order to determine what the

211:06

median salary is in each of these

211:08

countries I've also had to wrap this one

211:10

in an IFERROR because for some of these if

211:11

there's no values it throws an error and

211:14

I didn't want that popping up in the

211:15

chart so I had it disappear or make

211:17

it basically a blank value if it does

211:19

have an error anyway let's get into

211:21

visualizing this we're going to first

211:23

just visualize what are the counts of

211:26

these different jobs based on the

211:28

country so I'm going to select column A

211:30

and B go to insert and then maps and go

211:34

to this map chart now you may have a

211:37

pop-up warning that comes up during this

211:39

that says data needed to create your map

211:41

chart will be sent to Bing and I'm fine with

211:45

sending this data to Bing you should be

211:46

fine too with it so feel free to accept

211:48

this then you shouldn't get this pop up

211:50

anymore anyway this chart's pretty neat

211:53

because it goes and shows we have a

211:55

heavy concentration of jobs basically

211:58

from the United States for my job

212:00

scraper I'm heavily aggregating jobs

212:03

from this country compared to other

212:04

countries sorry other countries out

212:06

there but I am still nonetheless collecting

212:08

from other countries like the US has 25,000

212:10

India is around 580 for this one I'm

212:13

going to change the title to where are

212:16

most jobs in Luke's data set from

212:17

that's not to say the United States has

212:19

more jobs than other countries this is

212:20

just how my data set is and how I

212:22

extracted the data so don't want you to

212:24

come up with the wrong conclusions from

212:26

this now the visualization that I really

212:28

care about is comparing these countries

212:30

to the median salary so holding control

212:32

I select a and then C I'm going to do

212:35

recommended charts from this cuz I'm

212:36

having problems using the maps one

212:39

anyway I see that it has the filled map

212:40

here I'm going to select okay and I have

212:43

all the data filled in all right with

212:44

this visualization we can now dive in

212:47

we can see that we have a range of these

212:49

median salaries from over 157,000 down

212:52

to 30,000 with country like China having

212:55

around 68,000 and then over in Africa we

212:58

have Algeria at

213:00

45,000 so looks like we have a lower

213:02

salary in the African continent over in

213:05

North America and also South America

213:07

pretty high salaries along with

213:09

Australia as well anyway pretty cool

213:11

visualization we were able to generate

213:13

out of this I mean I love data and I

213:15

just love this visual with this I'm

213:16

going to change the title to what are

213:20

top paying countries now the last thing

213:23

is a minor Point sometimes if you're

213:24

going ahead and actually moving maybe

213:27

columns around you'll notice that my

213:29

visualization is also moving as well and

213:32

this can wreak havoc especially whenever

213:35

you've made your dash or made your chart

213:37

a certain size and then move columns

213:38

around and it messes everything up we

213:40

can fix this so I'm going to go ahead

213:42

and Ctrl+Z both of those column moves

213:44

to get it back to where I had previously

213:47

and then from there I'm just going to

213:48

double click on the chart itself go

213:49

under chart options and once again this

213:52

like resizing one here and going under

213:55

properties right now it's selected under

213:57

move and size with cells we don't want

214:00

to do that basically we don't want to

214:02

move or size with the cells so I'm going

214:05

to select don't move or size with cells now closing out of this

214:07

whenever I go to adjust the column size

214:09

it's not going to adjust the

214:10

visualization at all this is much more

214:13

of what I want also one last note on

214:15

this I do have a filter currently

214:17

applied to this data set specifically I

214:19

go into it it's a custom filter and I

214:22

wanted to make sure that I had basically

214:25

removed any na values so I put hey I

214:28

want values that are median salary greater

214:30

than zero and are less than 200,000 so

214:34

if I go ahead and clear this filter we

214:37

can see that we have some other values

214:39

up here basically Russia's up here at

214:41

300,000 for a median salary and if we

214:44

actually go in investigate Russia we'll

214:47

see that they only have around four jobs

214:49

with salary data listed so I feel like

214:52

this salary is more of an outlier than

214:55

anything so that's why I'm applying this

214:57

filter of 0 to 200,000 applying this

215:00

filter again we get final visualization
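
To make the logic of that map chart table concrete, here is a minimal Python sketch of the same three steps: count jobs per country, take the median salary per country, and leave a blank instead of an error when a country has no salary data, then keep only medians between 0 and 200,000. The rows and the function names are made up for illustration; this is not the workbook's formulas, just the same idea under those assumptions.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (job_country, salary_year_avg); None means no salary listed.
jobs = [
    ("United States", 90000), ("United States", 110000), ("United States", None),
    ("India", 60000), ("India", None),
    ("Russia", 300000),
    ("Sudan", None),
]

def summarize_by_country(rows):
    """Return {country: (job_count, median_salary or None)}."""
    counts = defaultdict(int)
    salaries = defaultdict(list)
    for country, salary in rows:
        counts[country] += 1  # plays the role of a COUNTIF on the country column
        if salary is not None:
            salaries[country].append(salary)
    summary = {}
    for country, n in counts.items():
        values = salaries[country]
        # Plays the role of the IFERROR wrapper: blank (None) when there are no salaries.
        summary[country] = (n, median(values) if values else None)
    return summary

def filter_medians(summary, low=0, high=200_000):
    """Keep countries whose median salary is strictly between low and high."""
    return {c: v for c, v in summary.items()
            if v[1] is not None and low < v[1] < high}

summary = summarize_by_country(jobs)
charted = filter_medians(summary)
print(charted)
```

In this toy data the single 300,000 row for Russia is dropped by the range filter, which mirrors the outlier reasoning above; filtering on the count instead would be a one-line change to `filter_medians`.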

215:03

now you could also play around with this

215:05

and filter it based on the number of

215:07

counts to make sure you have values that

215:09

are above a certain count that's also an

215:11

option and probably maybe even a better

215:13

option as well all right it's your turn now

215:16

to dive into those practice problems to

215:18

try out some different Advanced

215:19

visualizations and along with some

215:21

Advanced customization with that in the

215:24

next lesson we're going to be diving

215:26

deeper into understanding how to use

215:28

statistical analysis specifically box

215:30

and whisker charts and also histograms and

215:32

how to read them with that see you in

215:34

the next

215:38

one this lesson is going to be focused

215:40

on actually visualizing a lot of the

215:43

things that or a lot of the functions

215:44

that we used in that statistical

215:47

functions lesson where we're looking

215:49

visually at things like the median and

215:52

quartiles specifically we're going to

215:54

do a refresher on histograms we've seen

215:56

it a few times already but we're going to

215:58

dive into further understanding how

216:02

salaries are distributed specifically

216:04

for a target audience of data analyst in

216:06

the United States you can feel free

216:08

to do whoever you want and then from

216:10

there based on the limitations of it

216:12

only being able to visualize one job title

216:15

we're going to shift gears to looking at

216:17

box and whisker charts and these are

216:20

great at also showing statistical

216:22

distributions like a histogram but we

216:24

can take it a step further and we

216:26

compare different values specifically in

216:29

this case we're going to compare them

216:31

across the different job titles on how

216:32

they're distributed now box and whisker

216:35

charts aren't probably a chart that

216:37

you're familiar with or most people are

216:38

familiar with so we're going to go

216:40

through a review and understand and

216:42

break them down to understand those

216:44

Concepts we talked about previously

216:46

about median and quartiles and where

216:47

they fall into this for this we're going

216:50

to be using the charts statistics

216:53

workbook specifically we're going to be

216:54

starting in this data Tab and for all

216:57

this we're going to be analyzing salary

216:59

data in this video we're going to be

217:00

focusing specifically though on that

217:02

yearly salary

217:06

data so let's actually go back into

217:09

breaking down how to read a histogram we

217:11

go back into insert recommended charts

217:14

and then from there select histogram and

217:16

insert in the histogram I don't like

217:19

where it is right now I'm actually going

217:20

to move this chart into a new sheet now

217:25

quick refresher on histograms each one

217:27

of these bars represents a count of

217:29

values within a range so in this case

217:33

there's 920 values between the range of

217:37

oh my gosh so hard to read 75,000 to

217:41

81,000 and as we're noting by this we

217:43

have a large number over here it gets

217:45

even out to 960,000 this would be called

217:48

a skewed right distribution now this is

217:52

different from a column chart because

217:55

this data down here on the x-axis is

217:58

basically continuous data when one bin

218:01

stops so this first bin of 15,000 to

218:04

21,000 the next bin picks up now the

218:07

first problem with this histogram is

218:10

this is for all salary data specifically

218:12

all job titles across all countries I

218:15

want to actually fine-tune to look at my

218:18

specific use case of data analyst in the

218:20

United States so you can come here into

218:22

the histogram 2 Tab and I have the four

218:26

Columns of interest that I want to use

218:28

from the data Tab and I already have the

218:30

filters applied but if you want to you

218:32

can come in here and actually select to

218:34

clear these filters and I'll just select

218:36

it here from that Home tab then from

218:39

there I'm going to go through and select

218:41

data analyst roles that are full-time

218:44

only that are in the United States and

218:48

then finally I don't want any of these

218:49

blank values here so I'm going to

218:52

uncheck this value here for blanks now

218:54

we'll say filtering this data did take

218:57

some time to actually do so don't be

218:59

alarmed if this takes more than 10 or 15

219:01

seconds all right so back in let's

219:03

actually make a histogram with this data

219:06

we'll go into insert from here I'm going

219:08

to insert in a histogram now once again

219:10

this distribution is like the last one

219:13

skewed right and we have a heavy amount

219:14

of outliers right here even out to this

219:17

one value around 370,000 I don't think

219:20

this provides a lot of value instead I

219:22

want to actually focus more into these

219:25

this actual distribution and not

219:27

actually on this portion out here that

219:28

we have just outliers anyway I'm going

219:30

to come in here into our filters up here

219:32

insert a number filter and that it's

219:35

less than 300,000 click okay all right

219:39

this is looking a lot more readable

219:42

which we can actually see now the x-axis

219:45

now each one of these bars right here or

219:48

what what you would see in like a column

219:49

chart are called the bins and they're

219:52

all equally spaced but we can control the

219:55

width of each one of those bins that

219:57

they Encompass specifically I can double

219:59

click on the chart to bring up that pane

220:01

to the right selecting the x axis I can

220:04

then go into axis options and then

220:07

once again axis options we can go into

220:09

something right now we're noticing that

220:10

the bins are automatically determined we

220:13

can actually change this bin width I'm

220:15

going to change this to something like

220:17

15,000 notice that it is bigger in this

220:20

case the bins are bigger than they were

220:21

previously you can feel free to test

220:24

different options if you will I feel if

220:26

you go too small in the case let's say

220:28

we went down to 1,000 it just gets too

220:30

noisy and also you can't necessarily see

220:32

the distribution as well so really you

220:35

just have to play around with it until

220:36

you get to what you want to find as far

220:38

as the axis goes this is a little bit

220:41

this is sensory overload for me way too

220:43

many zeros in here so I'm going to move

220:45

this selecting the x-axis we can see that

220:48

has format axis now I can go under

220:50

number and once again we can go in our

220:52

custom type none of the ones that I've

220:54

previously done are here sometimes it

220:56

pops up sometimes it doesn't we're going

220:58

to go ahead and just put in we want the

221:00

dollar sign zero and then formatted with

221:03

the K value basically removing all those

221:06

uh thousands zeros and I'm going to go

221:08

ahead and click add all right this is a

221:11

lot more readable to actually see what

221:14

those different ranges are
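
As a rough illustration of what the bin width and the K-style axis labels are doing, here is a small Python sketch that groups salaries into 15,000-wide bins and labels each bin in thousands. The salary values and the helper names are made up, and the bins here start at multiples of the bin width, which is an assumption for simplicity; Excel picks its own starting edge from the data.

```python
from collections import Counter

def bin_salaries(salaries, bin_width=15_000):
    """Count salaries per bin; each bin starts at a multiple of bin_width."""
    counts = Counter((s // bin_width) * bin_width for s in salaries)
    return dict(sorted(counts.items()))

def k_label(value):
    """Format 85000 as '$85K', similar in spirit to the custom axis format."""
    return f"${value // 1000}K"

# Made-up salaries, only to show the mechanics.
salaries = [72000, 88000, 90000, 95000, 99000, 101000, 130000]

for start, n in bin_salaries(salaries).items():
    print(f"{k_label(start)}-{k_label(start + 15_000)}: {'#' * n}")
```

Dropping `bin_width` to 1,000 in this sketch gives one short bar per value, which is the same "too noisy" effect described for very small bins.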

221:15

and from there I'm going to change the

221:17

title of how much do data analysts in

221:20

the United States make probably also

221:22

best practice here to add a title on the

221:26

y-axis for count of jobs and bam now we

221:30

have this final visualization show on

221:31

our histogram we can see that a lot of

221:33

the salaries are more around the range

221:36

of 85,000 to 100,000 with 70,000 to 85,000

221:41

is coming up next so this is really

221:43

visually great at showing where I can expect

221:47

to have a salary as a starting data

221:52

analyst but now what if we want to

221:55

analyze multiple different job titles

221:58

what we're going to eventually get to is

222:00

this box plot here where we're plotting

222:02

it for all the different job titles we'll

222:04

be able to actually compare different

222:06

values across each other but before we

222:08

get to that we need to First understand

222:10

how to read a box plot also sometimes I

222:12

call it a box plot but it's also known

222:14

as a box and whiskers chart anyway I

222:17

made this visualization here you don't

222:18

have to do it there's a bunch of

222:19

customization along with it the main

222:21

purpose of this is to demonstrate or

222:23

help understand how to read a box and

222:27

whiskers chart so I took our data that

222:30

we previously were analyzing for data

222:32

analyst in the United States it was a

222:34

full-time role along with all the salary

222:35

data and then I use like we previously

222:38

did calculating things like the Min

222:41

first quartile median average third

222:44

quartile and Max just ignore this

222:46

portion right here it was used to make

222:48

build this visualization right here

222:50

anyway I tried as best as possible to

222:52

line up this histogram where we have the

222:54

x-axis going from 25,000 to 285,000 with

222:59

the box and whiskers chart I made below

223:01

it from 25,000 to 285,000 so the Box

223:04

itself signifies what data nerds call

223:07

the interquartile range basically all

223:10

the values between Q1 or quartile 1 and

223:14

quartile 3 had a typo there got to fix

223:16

that anyway that's why it was so

223:17

important that previously we calculated

223:19

that first quartile and third quartile

223:21

and if you remember from that they're

223:23

quartiles so 50% of the data Falls

223:26

within this box and if we look up we

223:28

were to draw imaginary lines into our

223:30

histogram we can see that about 50% of

223:33

the data does fall within this the next

223:36

up inside of here is a line that is for

223:39

the median in this case our median is

223:41

90,000 and then we have our average of

223:44

95,000

223:45

which as we discussed previously

223:48

the average is going to be higher here

223:50

because we have things all the way out

223:52

here called outliers basically dragging

223:55

that average higher and outliers are

223:57

signified by these dots outside of the

224:01

whiskers themselves these whiskers are

224:03

the lines and the lines themselves

224:05

extend to the minimum and the maximum

224:09

and these are just relative mins and

224:11

Maxes they're not necessarily the true

224:14

min and max anyway so that's a box and

224:17

whisker chart and frankly by themselves

224:20

I don't think they're really great but

224:22

when you pair them with other

224:24

categorical values I find them super

224:27

interesting so let's actually build this

224:29

visualization so you can come over to

224:31

this box plot2 Tab and I have our data

224:35

inside of it none of it is filtered it

224:38

has all the different job titles and all

224:40

their Associated salaries for this I'm

224:42

going to select column M and then also

224:44

holding control I'm going to select

224:46

column A then from there go in and

224:48

insert and go to recommended and from

224:50

there look at the box and whiskers chart

224:53

which looks like it's already pulling it

224:55

up for us so let's pop this bad boy in

224:57

now one drawback of these box and

224:59

whisker charts in Excel is unlike that

225:02

last box plot that I made I custom made

225:05

this in order to make it appear in this

225:07

horizontal fashion you can't actually do

225:10

that you can only have the option to

225:11

have them vertical up and down anyway

225:14

this is pretty close to what we want to

225:16

get the main problem I'm noticing right

225:18

now is we have outliers up to 1.2

225:22

million and it's really with the data

225:24

around 100 to 150,000 it's really hard to

225:28

actually look into those boxes so I'm

225:30

going to change this y-value scale double

225:33

clicking on the y-axis I'm going to

225:36

change the maximum to 300,000
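
For anyone who wants to see the numbers behind a box and whisker chart, here is a minimal Python sketch of the pieces described earlier: the quartiles, the interquartile range, the relative min and max that the whiskers reach, and the outliers beyond them. The salary list is made up, and two things are assumptions rather than anything confirmed in this lesson: the 1.5 times IQR rule for flagging outliers (a common convention) and the exclusive quartile method that Python's `statistics.quantiles` uses by default.

```python
from statistics import mean, quantiles

def box_stats(values):
    """Return the numbers a box and whisker chart is drawn from."""
    q1, q2, q3 = quantiles(values, n=4)  # default method is "exclusive"
    iqr = q3 - q1                        # height of the box
    low_fence = q1 - 1.5 * iqr           # assumed 1.5 * IQR convention
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "mean": mean(values),
        "whisker_low": min(inside),    # relative min, not necessarily the true min
        "whisker_high": max(inside),   # relative max, not necessarily the true max
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }

# Made-up salaries with one high outlier.
salaries = [50000, 60000, 70000, 80000, 90000, 100000, 110000, 300000]
stats = box_stats(salaries)
print(stats)
```

In this toy list the single 300,000 value is the only outlier, and it pulls the mean above the median, which is the same effect discussed for the average sitting higher than the median.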

225:39

additionally since we're here I'm going

225:40

to change that number formatting to use

225:43

that 0k value then also I'm finding the

225:46

color is a little hard to actually see

225:49

these x's in here so under series option

225:53

selecting fill and line then fill I'm going to

225:56

change this color to more of a lighter

225:59

blue okay and that's definitely easier

226:02

to read I'm going to add a vertical

226:04

axis title of salary USD I'm also going to

226:07

bold it all to make it a little bit more

226:08

readable and then from there change that

226:10

chart title to what are the top paying

226:13

jobs in data science all right getting

226:15

into actually analyzing this and getting

226:18

insights from it now one drawback out of

226:21

this is there's not an easy way to sort

226:25

these values right here right now I'd

226:27

normally put them high to low I'd

226:28

probably put them high to low based on

226:30

median salary but they've been put into

226:34

this graph based on the order that they

226:36

first appear over here in column A and

226:40

that's when they pop up so that's the

226:41

order so technically I could go through

226:43

and sort this column

226:45

alphabetically but that's going to take

226:47

a little bit too much time if you want

226:48

to do that feel free to try that out

226:50

anyway it looks like roles like machine

226:52

learning engineers and also software

226:55

Engineers have a pretty large inter

226:58

quartile range or where that 50% of

227:00

that data Falls so there's a basically a

227:02

wide range of data or salaries you could

227:04

find with that whereas data nerds data

227:07

scientists data analysts and data

227:08

Engineers have a tighter band also as

227:11

expected those data analysts and

227:12

business analysts have some of the

227:14

lowest median salaries where something

227:16

like the data engineers and the senior

227:18

roles have even higher median salaries

227:21

overall this is pretty great at going in

227:23

comparing values I would probably work

227:25

with this more to fine tune it to only

227:28

have a couple of job titles in it and

227:30

for that we can use something like

227:32

slicers which we'll be covering in an

227:34

upcoming chapter well the next chapter

227:37

when we get into Advanced Techniques in

227:38

Excel so we'll be able to customize this

227:40

further once you have that knowledge all

227:42

right you now have some practice

227:43

problems to go through and get more

227:45

familiar with those histograms and also

227:47

box and whisker charts in the next

227:49

lesson which is a quick one we're going

227:51

to be moving into spark lines which is

227:53

the final lesson in this chart overview

227:56

with that I'll see you in the next

228:01

one moving into this last lesson on

228:03

charts focusing on spark line spark

228:06

lines are basically ways to insert mini

228:10

charts into a cell that summarizes data

228:14

that's next to it if your data is coming

228:17

in a horizontal form similar to this

228:19

table you probably have the possibility

228:22

of considering inserting a spark line

228:24

we're going to be going through how to make

228:25

them but also customizing it all right

228:27

for this we're going to be using the

228:28

spark lines workbook for this we have

228:31

like usual our data Tab and then our

228:33

original tab that calculates data off it

228:35

and for this data set we're just looking

228:37

at what are the counts of the different

228:39

job titles based on month so this is

228:41

basically horizontally oriented this is

228:43

great for a spar

228:47

line so how we're going to do this well

228:49

we'll go ahead and select the data only

228:51

so C4 to N10 then come up into the

228:54

insert tab then right here we have this

228:57

section on spark lines we can insert a

229:00

line column or a win loss we'll just

229:02

start with column to start with and it

229:04

fills in for the data range C4 to N10 but

229:07

it wants us to choose where you want the

229:09

spark lines to be placed so the location

229:11

range and click this Arrow here and then

229:13

from there actually drag it next to it

229:16

I'll close this arrow back and click okay
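
To show the idea of a mini chart that summarizes the row next to it, here is a small Python sketch that turns one row of monthly counts into a one-line text chart, and finds the months a high point or low point marker would land on. The counts and names are made up for illustration, and this is only an analogy for what the sparkline cell displays, not how Excel draws it.

```python
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Map each value to one of eight bar heights, scaled to the row's own min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BARS[0] * len(values)  # flat row: every bar the same height
    return "".join(BARS[round((v - lo) / span * (len(BARS) - 1))] for v in values)

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
# Made-up monthly job counts for one job title.
counts = [120, 90, 80, 70, 60, 50, 45, 40, 38, 35, 20, 22]

line = sparkline(counts)
high_month = months[counts.index(max(counts))]  # where a high point marker would go
low_month = months[counts.index(min(counts))]   # where a low point marker would go
print(line, high_month, low_month)
```

Each row is scaled to its own minimum and maximum here, so rows with very different totals still show their shape, which is the main point of a per-row mini chart.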

229:20

anyway I wanted to demonstrate that bar

229:21

chart because it's not really that great

229:23

for here remember anytime we're doing

229:25

continuous data in this case we're doing

229:28

that monthly data I'm going to want to

229:29

use something like a line chart instead

229:31

so I can easily change it by coming up

229:33

here selecting all of our different data

229:35

selecting that spark Line tab and then

229:37

just changing it to I can change

229:39

something like win loss which no really

229:41

data from this line chart that's what we

229:43

really want from this now getting into

229:45

the customization of this I really

229:47

personally I like blue so we're going

229:49

to stick with the blue color but we

229:51

could change the color if we want to and

229:53

the other thing we change is the marker

229:54

color right now we don't have any

229:55

markers on it we can actually change

229:57

which markers are right here in the show

229:59

selection right here so I can select the

230:02

high points right now it's going to

230:03

highlight all of them red uh low point

230:06

also red negative points there's no

230:08

negative point you also do the first

230:11

point which I don't really find much

230:13

value in that or last point and then

230:15

actual finally the markers itself you

230:17

just put every single one of them with a

230:18

marker I really like this High Point and

230:21

this low point and we can customize this

230:25

the high points I would really want to

230:26

call out to be a green color right now

230:30

this green that's sort of hard to see so

230:33

I'm going to change it to something a

230:34

little bit darker and Bam we can see

230:36

that one a little better the red for the

230:37

low point I'm going to keep it as is and

230:39

the last thing is all this data has

230:41

basically a grid around it I'm just going to

230:43

add that in real quick by selecting all

230:45

the cells come up into home into the

230:48

borders I'm going to put in all borders

230:50

around it then it looks like I have a

230:51

double line right here for this lower

230:53

one so I'll insert this bottom double

230:56

border and then finally I'm going to put

231:00

a thick border around this all bam we

231:03

have our final visualization there now I

231:06

can go through and see things like okay

231:10

with data analysts and other analysts we

231:12

saw spikes in January but things like

231:14

data engineers we didn't see a spike

231:17

however all the job titles ran into a

231:19

similar problem where apparently they

231:20

ran out of budget and the least amount

231:23

of jobs were posted in November and

231:25

December so this is a pretty cool feature

231:26

to show some quick snapshots about the

231:29

data you're looking at all right you now

231:31

have some practice problems to go

231:32

through and basically practice making

231:34

some of these spark lines we're going to

231:36

next be jumping in the next chapter it's

231:38

our final chapter of the basic section

231:41

and it's going to be focusing on

231:43

Advanced features inside spreadsheets

231:45

such as tables formatting and how to

231:47

collaborate with others it's our last

231:49

section before we build our first

231:51

project so with that I'll see you in the

231:53

next chapter on Advanced

231:58

spreadsheets data nerds welcome to this

232:01

last chapter in the basic section

232:04

focusing on Advanced features and

232:06

spreadsheets this is the last chapter

232:07

we're going to be covering before we get

232:08

into our first project and this chapter

232:11

is broken into three different lessons

232:13

this one right here is going to be on

232:15

tables how to use tables how to use

232:17

things like slicers and how to

232:18

manipulate them second lesson is on

232:20

formatting not just on making cells look

232:23

pretty but developing conditional

232:25

formatting rules in order to highlight

232:28

cells according to well a certain rule

232:31

pretty interesting feature within Excel

232:33

and the third lesson is on collaboration

232:35

for a project we're going to be making a

232:37

dashboard and so we need to enact

232:40

certain measures in order to protect it

232:42

and prevent people from going in and

232:44

messing it up and so we're going to go

232:46

over a lot of features in order to set

232:47

it up properly anyway back to this

232:49

lesson what are we going to be doing for

232:50

it well first we're going to start out

232:52

by using a smaller subset of our data

232:56

set basically 15 rows and creating your

232:58

first table we're going to be

233:00

manipulating it using custom formulas

233:02

that we really haven't seen before along

233:04

with using some other ones that we have

233:06

seen before in order to calculate totals

233:08

subtotals and Aggregates by the end of

233:10

this lesson we're going to be building a

233:12

mini dashboard to analyze that histogram

233:15

that we talked about in our previous

233:17

lessons specifically we're going to add

233:19

slicers to it in order to be able to

233:21

filter down and look at a subset of data

233:24

that we're most interested about and

233:26

that all couldn't be done without the

233:28

help of tables for this lesson we're

233:30

going to be using the tables workbook in

233:33

chapter

233:36

4 for this you're going to start in the

233:38

tables intro original sheet and then the

233:40

final one's going to be what we're going

233:41

to eventually get to all these are going

233:43

to be labeled similarly with the

233:44

original and final and we're all going

233:46

to be working with the original it

233:47

should look like the final when you get

233:48

done with this so let's dive into

233:49

creating our first table first thing you

233:52

have to do is make sure that we're

233:53

selected somewhere in here we don't

233:54

necessarily need to select the full

233:55

table but just somewhere in here from

233:57

there we'll go into the insert Tab and

233:59

we'll insert a table also notice that we

234:02

can use the shortcut control t for this

234:05

so I'm going to do that instead and for

234:07

this it automatically pinpoints the

234:10

rightmost cell and the bottom most cell

234:12

and we need to make sure we have this

234:13

check mark enabled of my table has

234:15

headers because we have well column

234:17

headers and Bam we just made our first

234:19

table this lesson's over but seriously

234:21

let's actually get into exploring this

234:22

table design tab that now appears

234:24

anytime you're selected to the table if

234:26

I click off of it it disappears anyway

234:28

we're going to first look at the table

234:30

name and I'd like to have a table name

234:32

that's easy to reference so I'm going to

234:34

just name it something like jobs it's

234:36

going to come in handy naming it

234:38

something simple whenever we're making

234:40

formulas later for this now we'll get to

234:42

this section in a little bit on tools and

234:44

external table data but I want to move

234:47

over to the style options you can play

234:49

around with some of these options here

234:50

where you can highlight the First Column

234:52

or you can highlight the last column has

234:54

a lot of different formatting options

234:56

with it but what I really like is this

234:58

color formatting if I'm not really

235:00

liking the color that it's given to me

235:02

just come over here select a new one so

235:04

we'll get back to table design in a bit

235:06

but what's really the benefit of this

235:09

table well one thing is you can easily

235:11

add data to a table and it will

235:14

autofill let me show you let's say I

235:16

wanted to add a new column with a salary

235:19

year average copy whenever I enter this

235:21

new column name and press enter it

235:23

automatically fills this in I can the

235:26

skills are sort of covering this up

235:27

right now sorry about that and I can

235:29

make this a little bit bigger but you

235:31

can see we have salary year average copy

235:33

now included within this table and I can

235:36

verify that it's included also in this

235:37

table by if I want to go to resize table

235:39

it will say that now it goes to

235:42

K16 now for this I just want to copy the

235:45

results of the salary year average

235:47

column over here in h so what I'm going

235:49

to do is press equal to and I'm just

235:51

going to select the cell over here of H2

235:55

now this is what I was talking about

235:56

whenever I said tables have their own

235:58

unique formulas what it's going and

236:00

doing here is it's referencing the

236:01

salary year average column which is this

236:04

portion right here and then it's also

236:06

using this at symbol to basically refer

236:08

to this is the same point in the row of

236:11

H2 that is a K2 anyway when I go ahead

236:14

and press enter Watch What Happens we

236:17

actually fill in all the different

236:18

values of this so if I were to actually

236:20

double click into this one down here we

236:22

still have that same syntax of we're

236:25

selecting the salary year average column

236:27

and we're using that at value to

236:29

get the one that corresponds in that

236:31

same row now let's dive deeper into

236:33

these different formulas we can use for

236:35

this table so I'm going to come over

236:36

here into column n and for this remember

236:39

we named our table jobs so I'm just

236:42

going to type in jobs and I have two

236:45

tables in here one called job one jobs

236:47

you only have one popping in here anyway

236:49

it automatically pops up so I'm going to

236:51

select jobs and now whenever I do this

236:55

I'm going to press enter it's using our

236:57

modern dynamic arrays basically to fill

237:00

in all the data that we have over here

237:05

inside of our table so pretty unique in

237:07

how we can reference this now what

237:09

happens if we wanted to also include the

237:11

column headers up at the top well I can

237:12

type in jobs and then from there I'm

237:15

going to add a square bracket and we

237:18

have a few options popping up right now

237:20

it looks like it's just the column titles

237:22

but if we scroll down we have these

237:25

values here with hashtags in it

237:27

specifically I want with the column

237:28

headers so I'm going to put hashtag

237:31

headers I'm going to put a close bracket

237:33

on this and then press enter and now we

237:36

have the column headers across the top
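
A rough way to picture these specifiers, sketched in Python rather than Excel: treat the table as a header row plus data rows, and let each specifier pick a slice. Only the table name `jobs` comes from the lesson; the column names, the sample values, and the helpers `structured_ref` and `this_row` are made up for illustration. This is a mental model, not how Excel evaluates references.

```python
# Toy model of an Excel table named "jobs" (column names and values are invented).
headers = ["job_title_short", "salary_year_avg"]
data = [
    ["Data Analyst", 90000],
    ["Data Engineer", 120000],
    ["Data Scientist", 110000],
]


def structured_ref(specifier="#Data"):
    """Return the rows that a reference such as jobs[#Headers] would spill."""
    if specifier == "#Headers":
        return [headers]           # =jobs[#Headers] gives the header row only
    if specifier == "#Data":
        return data                # =jobs or =jobs[#Data] gives the data rows only
    if specifier == "#All":
        # =jobs[#All] gives headers plus data (and the totals row, if one is shown)
        return [headers] + data
    raise ValueError(f"unsupported specifier: {specifier}")


def this_row(column, row_index):
    """Mimic the @ reference: the value of `column` in the formula's own row."""
    return data[row_index][headers.index(column)]


print(structured_ref("#Headers"))
print(len(structured_ref("#All")))
print(this_row("salary_year_avg", 0))
```

The `@` helper is the part that explains the auto-fill seen earlier: every row runs the same formula, and only the row it sits in changes.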

237:38

now that's a little bit too much work

237:39

having to do two different formulas for

237:41

this if instead I wanted to do jobs and

237:44

then square bracket and see the options

237:47

available I can see I have an all a data

237:49

only a headers and a totals row totals

237:52

row we're going to get to a little bit

237:53

so we'll do the all for now and if I go

237:55

ahead and press enter bam we now have

237:58

our data with our column headers and

238:01

also the data itself but what happens if

238:04

you want to just access certain columns

238:06

well I thought you'd never ask well

238:08

once again I can type in something like

238:10

jobs put the square bracket and then we

238:12

have a list of different columns

238:14

available let's do the salary year

238:16

average and do a close bracket once

238:18

again this is going to provide the data

238:20

values only if we wanted to include the

238:23

specific header for this I once again

238:25

need to put in jobs and this time I'm

238:27

going to need to specify not only the

238:30

headers so I need to put this in its own

238:33

square brackets but I'm also going to

238:35

have to do a comma put another square

238:37

brackets and put salary year average

238:41

within its own brackets so it's almost

238:43

like a list of items if you're familiar

238:45

with python this would be like a list

238:46

anyway we have the headers in Brackets

238:49

and we have salary year average in

238:50

Brackets pressing enter we get salary

238:53

year average up at the top now honestly

238:55

an easier way to do this all is to well

238:57

use that all command or hashtag all but

239:00

it has to be put within its own square

239:02

brackets then from there a comma and

239:04

then we want to say hey the subset only

239:06

that we're providing for this is salary

239:09

year average close that bracket and then

239:11

close the entire brackets for jobs now

239:14

from there when we run it we get the salary

239:16

year average along with all the column

239:18

values at any time if you forget that

239:20

it's not that big of a deal as you can

239:23

just go through and put an equal sign

239:25

and like we did previously I could just

239:27

highlight well not that um our salary

239:30

year average column and look it

239:32

automatically populates with that same

239:34

formula above here and when I press

239:36

enter boom it pops up there so don't

239:38

think you have to memorize these

239:39

formulas that I just went over but what

239:41

do all these formulas actually provide

239:43

any value for well let's look at a

239:45

use case let's say I wanted to identify

239:48

jobs that whenever we looked at the

239:51

skills we could find out if they

239:53

contained the skill of Excel or not so

239:56

I'm going to create this new column over

239:57

here and call it Excel and for this

240:01

we're going to be using the search

240:03

function which we need to provide what

240:06

text we want to actually find

240:07

conveniently I put it in the column

240:08

header so I'll go ahead and just select

240:10

it and automatically populates the

240:12

formula for this then from that we need

240:14

to go to the next parameter of within

240:16

text we're trying to look at that job

240:18

skills column it puts that at symbol at

240:21

the front of job skills to basically

240:23

signify look at that row then from there

240:26

I'm going to go ahead and close the

240:27

parentheses and press enter so for that

240:29

search function it provides the

240:31

numerical location of excel in here

240:35

Excel is 36 characters deep into this so

240:38

I'm just going to modify this cuz I

240:39

don't really care about the number of

240:41

that I'm going to say I'm going to use

240:43

the ISNUMBER function which checks if

240:46

it's a number and then returns true or

240:49

false in this case we have True Values

240:51

so we know that for these columns if

240:54

they contain Excel or not they'll have

240:56

true so that's how I find myself using

240:58

these different formulas and

240:59

understanding how to actually manipulate

241:01

them anyway let's get into our next step
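
The Excel column check just described (search for the text, then test whether the result is a number) can be mimicked in Python as a rough sketch. The helper names and the sample cell value are invented; unlike Excel's SEARCH, this version does not support wildcards and returns `None` where Excel would return a `#VALUE!` error.

```python
def search(find_text, within_text):
    """Rough stand-in for SEARCH: 1-based, case-insensitive position, or None."""
    pos = within_text.lower().find(find_text.lower())
    return pos + 1 if pos >= 0 else None


def contains_skill(skill, job_skills):
    """Equivalent in spirit to =ISNUMBER(SEARCH(skill, job_skills))."""
    return search(skill, job_skills) is not None


skills = "['sql', 'python', 'excel']"  # invented sample cell value
print(search("Excel", skills))           # position of the match
print(contains_skill("Excel", skills))   # match found
print(contains_skill("Tableau", skills)) # no match
```

Wrapping the search in a true/false check is what turns "where is it" into "is it there", which is all the new column needs.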

241:03

let's say we wanted to include some sort

241:05

of totals Row in order to maybe

241:07

calculate median salary how many job

241:09

postings there were Etc so we'll go into

241:12

this table design Tab and I'm going

241:13

to select the total row and now

241:17

down here in row 17 we have total

241:19

written down here along with a bunch of

241:22

well blank values except for all the way

241:23

to the right looks like it puts us the

241:26

number of 15 which is the total of these

241:29

now going over to that salary year

241:31

average column I can basically select

241:34

this totals row right here and you

241:36

notice a drop down appears right here

241:38

from here we can select some basic

241:40

statistics average count min max

241:43

variance go ahead and select average

241:45

that's the average of this column right

241:46

here so pretty neat I'd go through and

241:48

if I wanted to do other columns as well

241:50

that now you can also go into here and

241:52

select more functions and then like we

241:54

said we want to calculate Median on this

241:56

salary we could go ahead and select this

241:59

function of median but I'm actually

242:01

going to recommend another approach you

242:04

see if we double click inside of here we

242:06

actually see that this totals column is

242:10

using a function specifically the

242:12

subtotal function so let's

242:15

actually build this out from scratch

242:16

without selecting it luckily we have the

242:18

salary year average copy column over

242:19

here so I'm going to go in and I'm going

242:20

to type in subtotal and it returns a

242:24

subtotal in a list or database first is

242:27

the function number what do we want it

242:29

to actually do and this has even more

242:32

values available to it that you can

242:35

actually select from and perform on this

242:39

so in this case let's say I wanted to

242:41

find out what the max value is I would

242:43

plug this in it would be 104 and then

242:45

for the reference for this well we're

242:47

just going to select this salary year

242:49

average copy column it automatically

242:51

transformed into this special syntax and

242:55

then add a closing parenthesis and press

242:56

enter and so now we have the max salary

242:59

which looking at this it's true but if

243:01

we go back into this and actually

243:02

inspect what values are available in

243:05

this function number we can see that

243:08

median is not available in here so what

243:11

are we going to do well there's another

243:14

function we're not going to use median

243:16

but that I recommend instead of using

243:18

subtotal and for this one we're going

243:20

to use the aggregate function and this

243:23

returns an aggregate in a list or

243:25

database it's similarly designed where

243:28

it has a function number but with this

243:30

one we have a lot more options including

243:34

things like CTO and stuff like that

243:35

anyway it has median available as number

243:38

12 now the second parameter on options

243:42

allows us to select a host of options uh

243:46

no pun intended for allowing us how we

243:48

want to actually perform this aggregate

243:50

basically do we want to maybe ignore

243:53

hidden rows or do we want to ignore

243:55

error values in my case I don't really

243:58

want to ignore anything so I'm just

243:59

going to do number four and then finally

244:02

we need to insert the array or the

244:03

column itself in this case we want

244:05

salary year average closing the

244:07

parentheses on this and pressing enter

244:09

we get our median value of 94,000
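
As a sketch of what the function number and options arguments are doing, here is a toy Python version of AGGREGATE. It models only the two function numbers used in this lesson (4 for max, 12 for median) and only options 4 (ignore nothing) and 5 (ignore hidden rows); the sample salaries and the `hidden` flags are invented.

```python
import statistics

# Function numbers used in the lesson; AGGREGATE offers more than these.
AGGREGATE_FUNCS = {4: max, 12: statistics.median}


def aggregate(function_num, option, values, hidden=None):
    """Toy version of AGGREGATE(function_num, options, array).

    Only option 4 (ignore nothing) and option 5 (ignore hidden rows)
    are modelled in this sketch.
    """
    hidden = hidden or [False] * len(values)
    if option == 5:
        values = [v for v, h in zip(values, hidden) if not h]
    elif option != 4:
        raise ValueError("only options 4 and 5 are modelled in this sketch")
    return AGGREGATE_FUNCS[function_num](values)


salaries = [80000, 94000, 150000]  # invented sample data
print(aggregate(12, 4, salaries))                               # median of all rows
print(aggregate(4, 5, salaries, hidden=[False, False, True]))   # max of visible rows
```

The design point is that the statistic and the "what to skip" rule are separate arguments, so switching from median to max, or from all rows to visible rows, is a one-number change.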

244:16

now depending how fast your computer is

244:18

you're going to run into some

244:19

limitations here I have in the table

244:22

limits original tab which is the next

244:24

one we're going to be working with in

244:25

this uh portion of the lesson it has

244:28

around well 32,000 which is in the data

244:30

set anyway we're going to run into some

244:33

limitations as I'm going to show I'm

244:34

going to encourage you to just watch

244:36

along uh me do this and then from there

244:39

basically decide if you think you have a

244:41

strong enough computer or not to

244:42

continue on to do this

244:44

um but if you have a pretty uh basically

244:46

slow computer I wouldn't necessarily

244:47

follow along with this anyway I'm going

244:49

to convert this into table by selecting

244:50

any portion in here pressing Ctrl+T it

244:53

selected all the different values and

244:55

that my table has headers so now we've converted

244:57

this into a table and one of the

244:59

benefits we haven't really discussed yet

245:01

is the ability to actually filter data

245:04

because it automatically provides this

245:05

filter up at the top now I'm going to go

245:08

ahead and filter this down based on a

245:10

data analyst job title and when I go

245:14

through and actually select this to just

245:15

select data analyst and press okay it

245:18

runs pretty quickly but I have run into

245:22

problems in the past especially working

245:24

with smaller computers where it takes a

245:26

while to do this I'm working with about

245:29

24 GB of RAM on this virtual machine so

245:34

if you're at something like 8 or even 4

245:37

I'm going to highly recommend that you

245:39

may not perform this exercise

245:44

so moving to this last exercise of this

245:46

lesson I've gone ahead and condensed

245:48

down this data set you can go into

245:50

histogram original and our previous data

245:53

set I basically shortened it down to these

245:55

four columns and limited to only

245:59

positions that have a salary year

246:02

average value listed basically if

246:04

there's blanks I remove those rows so

246:06

it's about 20,000 rows anyway this is

246:08

what we're going to be manipulating for

246:09

this this shouldn't lock up your

246:11

computer if you have a basically a

246:14

computer with less RAM and we're going

246:16

to convert this into a table first

246:18

pressing control T I select all the

246:20

values on here and press okay so now we

246:23

have a table now also in this sheet you

246:26

may have noticed hopefully that it's

246:27

been on the screen I have this histogram

246:29

here which is basically aggregating the

246:31

data from this Delta column on salary

246:33

year average anyway we're going to be

246:37

manipulating this further we want to

246:39

basically make this into a dashboard so

246:41

we can go through and maybe filter for

246:42

different job title different job

246:44

schedule types or different job

246:46

countries and it can be mildly

246:48

inconvenient to come up here and

246:50

actually select this arrow and then go

246:51

through and select the values you want

246:53

that's why slicers are great so with our

246:56

table selected I'm going to go into

246:57

table design and then from there under

247:00

Tools I'm going to go to insert slicer

247:03

we're going to be entering in both a job

247:04

title short job schedule type and a job

247:07

country slicer so all three are here now

247:11

I'm going to go ahead and position them

247:12

make them look a lot neater all right

247:14

got them cleaned up and then from there

247:16

I can go ahead and actually select the

247:18

slicer and if you notice this slicer

247:20

tab pops up conveniently labeled this

247:22

slicer has a caption on it or a title as

247:25

well and I can just rename it basically

247:27

to a better visually appealing title in

247:30

this case I want to call it job title

247:32

and then it updates here for job title

247:34

I'm going to do the same for the other

247:35

two updating it to schedule type and

247:38

then also Country Now by default this

247:41

slicer and all the slicers have all the

247:43

values selected so if I wanted to go

247:45

in to actually select a value I could do

247:48

something like well we want to look at

247:49

data analyst I just select data analyst

247:51

it's going to clear all those other ones

247:53

and then only select data analyst as you

247:55

notice it took a second for it to

247:56

actually load that's why with this

247:58

20,000 rows of data even that's a little

248:00

high for tables I recommend it around

248:02

10,000 if you're using tables anyway we

248:05

have it filtered down to data analyst I

248:07

could also do it down to

248:09

fulltime along with filtering it for U

248:13

basically

248:14

uh I want to do United States if you

248:15

notice these values are grayed out that

248:16

means there's no country basically

248:18

available with the current selections

248:19

that I have of data analyst and full-time
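
One way to reason about which slicer values get grayed out, as a Python sketch with invented rows: a country stays available only if at least one row matches the selections in the other two slicers. The helper `available_countries` is hypothetical and only imitates the behavior described here.

```python
# Invented sample rows: (job title, schedule type, country).
rows = [
    ("Data Analyst", "Full-time", "United States"),
    ("Data Analyst", "Part-time", "Canada"),
    ("Business Analyst", "Full-time", "United States"),
    ("Data Engineer", "Full-time", "Germany"),
]


def available_countries(rows, titles, schedules):
    """Countries that still have rows under the other two slicers' selections."""
    return sorted({country for title, schedule, country in rows
                   if title in titles and schedule in schedules})


all_countries = sorted({r[2] for r in rows})
active = available_countries(rows, {"Data Analyst"}, {"Full-time"})
grayed_out = [c for c in all_countries if c not in active]
print(active)
print(grayed_out)
```

Because each slicer's selection is a set, the same function covers picking several values in one slicer: values within a slicer are combined with "or", and the slicers are combined with "and".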

248:22

so that's what that means there but I

248:23

can go into that for United States

248:25

selecting it and Bam we now have our

248:29

final basically visualization but what

248:31

happens if I want to maybe look at

248:34

multiple different values what if I

248:36

maybe want to look at both data analyst

248:37

and business analyst well in that case

248:40

you want to select this box up here and

248:42

it allows multi-select and so I enable

248:45

it and now I can go through and select

248:46

something like business analyst and this

248:49

provides both those values along with I

248:51

wanted to look at full-time and also

248:52

part-time I could enable the multi

248:55

select on this schedule type and select

248:57

part-time and Bam now we have multiple

249:00

values selected for this along with the

249:02

United States and this makes the

249:04

dashboards that you're building a lot

249:05

more interactive and a little bit fun to

249:08

play around and to visualize the

249:09

different data all right we have some

249:11

practice problems for you now to go

249:13

through and dive into not only creating

249:15

tables manipulating them but also adding

249:18

and playing with slicers as well with

249:20

that we'll see you in the next lesson

249:21

we're going to be jumping into

249:23

formatting specifically conditional

249:25

formatting so see you

249:30

there in this lesson we're going to be

249:32

focusing on formatting and not just cell

249:35

formatting where we're going through and

249:36

adding borders and colors but also

249:39

conditional formatting where a cell's

249:42

basically formatting highlighting will

249:44

update dynamically based on a value in

249:47

the first example we're going to focus

249:49

on Cell formatting specifically we're

249:51

going to go back to that table that

249:52

we've worked with previously that does a

249:55

count of data science jobs over the

249:57

month anyway we're going to go through

249:58

and actually format it using all the

250:00

different functions we can in order to

250:02

make it look pretty like I made it from

250:04

there we're going to move into our first

250:05

conditional formatting example where

250:07

we're going to look at basically

250:08

highlighting based on a job title those

250:11

that are basically high and those that

250:13

are low highlighting them appropriately

250:15

green or red and then in our final

250:17

example we're going to move on besides

250:18

using color scales to also using things

250:21

like data bars and also icon sets to

250:24

make it look a lot more Dynamic we're

250:26

also going to go over best practices on

250:28

what not to do cuz sometimes you can go

250:30

overboard in how much you're actually

250:32

coloring a table and you can make it a

250:34

little distracting and ultimately

250:36

not meet your goal for this lesson we'll

250:38

be working with our formatting notebook

250:41

in chapter 4 as usual all the data is

250:44

located in the little data Tab and we'll

250:46

be starting with the underscore original

250:49

of each of these sheets and then it

250:51

we'll get to in this case format

250:52

original we'll have what it looks like

250:54

format

250:57

final for the cell formatting we're

250:59

going to be using this format original

251:00

sheet and we're going to be focused on

251:02

this Home tab here so I'm actually going

251:04

to leave it expanded and for this we're

251:06

going to make this to where well what

251:09

this table looks like by going through

251:10

and actually formatting using all the

251:12

different features in here so the first

251:13

thing we need to do is highlight it all

251:15

and actually remove the formatting so

251:17

with it all selected I can go to editing

251:20

and then clear and I can either clear

251:22

all which is what I don't want to do I

251:23

want to do clear format and Bam now we

251:27

have an ugly table that doesn't really

251:29

make a lot of sense now previously we

251:31

were messing with tables so I could

251:32

highlight from B3 to O10 and make this

251:35

into a table by coming up here to format

251:38

as table basically selecting the color

251:40

that I want saying that it has headers

251:43

and allowing it to update there's

251:45

definitely an option um but I'm not

251:47

necessarily a fan of this so I'm going

251:49

to clear this by pressing Ctrl+Z now

251:51

an underused feature of formatting is

251:53

this cell Styles tab right here so I'm

251:56

going to go ahead and select the months

251:58

up here basically the titles and for

252:00

cell Styles they actually have a lot of

252:03

pretty unique formatting you can see

252:05

happening in the background so I'm going

252:07

to try out in this case I'm going to try

252:09

out heading two which is pretty neat

252:11

because it makes it bold slightly bigger

252:13

and it puts a little line underneath it

252:16

I could do something also where I

252:17

highlight all the rows over here and

252:19

then make this into maybe heading three

252:22

and then all these values in here are

252:24

calculations so technically I could just

252:27

highlight this all and for the cell

252:29

Styles I could come up to the top here

252:31

and select hey this is a calculation and

252:34

this is not a bad looking table uh but not

252:36

necessarily all I want to do so I'm

252:37

going to just remove this all instead

252:40

I'm going to start with my months I'm

252:41

going to make them bold and also add a

252:44

light gray background I'm going to do

252:46

the same thing over here for the values

252:48

in my rows and then from here we're

252:49

going to get the actual column grid

252:51

lines put in I'm going to only select C3

252:54

all the way down to O10 I'm going to

252:55

show you why and I'm going to add an all

252:59

borders so this is neat it adds all

253:01

borders to it what I'm going to also add

253:04

to this which will add a little bit of

253:05

flair to it is a thick outside border so

253:08

now we got a thick outside border around

253:10

all of this and I'm going to do the same

253:13

with this one of an all borders and then

253:16

a thick outside border now it did remove

253:20

that thick outside border that I had on

253:21

this line between B and C so I'm

253:24

actually going to go ahead and put that

253:25

back in by just clicking it next thing I

253:27

want to do is format these with a comma

253:30

so I'm going to come up here and well

253:32

add a comma and then unfortunately it

253:35

adds this space in here and makes this

253:37

table bigger than what you can see now

253:39

I'm going to first remove the decimal

253:41

places and then in order to fix

253:43

this I'm going to highlight all the

253:45

different columns through here to

253:46

January and just double click on one of

253:48

them to make them slightly smaller

253:51

anyway it's still not fitting completely

253:53

on here and I want this to fit within

253:55

the view here so I'm just going to

253:57

select this all and I'm actually going

253:59

to make these values slightly smaller

254:02

and I'm not liking the positioning of

254:04

these it looks like it's lower now that

254:06

I made this smaller so I'm going to

254:07

actually center this do a middle

254:09

align basically move it up slightly all

254:12

right my OCD is no looking good all

254:14

right now this is looking good now the

254:15

last thing we want to do is add a title

254:18

to this basically describe what is this

254:20

table that we're looking at and I want

254:22

to insert this in up on the top row but

254:25

I basically want it centered over this

254:26

table so what I can do is highlight from

254:28

b11 and from there select up here for

254:33

merge and also I want to Center because

254:35

that's where I want my text centered during this

254:37

and from there I put in hey this is the

254:38

data science job count tracker and for

254:41

the cell style I'll make this heading

254:45

one now let's get into conditionally

254:48

formatting this table and specifically I

254:52

want to say if I'm looking at data

254:53

analyst I want to be able to look across

254:55

here and see which ones are the highs

254:57

and the lows right now I have this grid

254:58

lines and I can see that based on the

255:00

green and red or the highs and lows but

255:02

I want to actually be able to see this

255:03

in this table right here and so

255:04

underneath the Home tab we have this

255:06

conditional formatting available we're

255:09

going to focus on these three right here

255:10

first and that is data bars and you can

255:13

see if I put it in it's basically

255:14

looking like a you know like a bar chart

255:17

color scales allows us to do well

255:19

different color formatting with it and

255:21

then an icon set basically allows us to

255:23

put in a nice looking icon and we're

255:25

going to stick simple for now we're

255:27

going to do color scales right now I

255:29

have C4 through N4 selected I'm going to

255:32

go ahead and select this green to Red

255:35

which is not bad if we're looking this

255:37

right this is doing exactly what I want

255:38

I want August which is the highest to be

255:40

highlighted green to attract my eyes to

255:42

it and then I want the red to be

255:43

November and December cuz at least I want

255:45

to attract attention to it but we want

255:46

to highlight the entire table here so if

255:50

I were to actually select the entire

255:52

table if you will from C4 all the way

255:54

down to n10 go into conditional

255:56

formatting color scales and do the same

255:59

thing you're going to notice it

256:01

basically does these bands but it does

256:04

this entire

256:06

table all formatted together and this is

256:09

not what we necessarily want of course

256:11

the total row is going to be the

256:13

highest I want to look through that row

256:15

and actually see where I should be

256:16

actually looking so anytime we need to

256:18

clear or mess with any rules we come into

256:19

conditional formatting and go to clear

256:21

rules you have clear from selected cells

256:24

or entire sheet we're just going to do

256:27

the entire sheet then we're going to go

256:28

back to where we were before of

256:30

selecting just the data analyst values

256:32

going into conditional formatting color

256:34

scales and I'm going to go to this green

256:36

white red I actually want to try to

256:38

limit how many colors I use two is

256:41

enough so I'm going to go green white

256:42

red and I really like this one

256:44

better now I don't need to necessarily

256:46

go through once again of selecting

256:47

senior data analyst doing this again

256:49

what I would do instead is I'm going to

256:51

select data analyst here and then come

256:54

into this home menu up here and you

256:56

notice this paintbrush this is a format

256:59

painter in the instructions it basically

257:01

says select the content with the

257:03

format you like click format painter and

257:05

then select something else to

257:06

automatically apply the formatting so

257:08

from here I can just paint my formatting

257:11

on unfortunately this doesn't have a

257:13

shortcut so I have to go back up

257:15

every single time it removes the

257:17

marching ants reselect the format

257:19

painter and go through and select it but

257:22

now we have this formatted how I want it

257:24

where I can look at a certain Row in

257:26

this case I look at data analyst see

257:28

what some of the highest are and Senior

257:29

data Engineers I can see how they

257:31

contrast to the other job titles
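
Why the per-row rules read so differently from one rule over the whole table can be shown with a small Python sketch. It maps each value to a position between the low end (red) and the high end (green) of the scale; the counts are invented, and Excel's actual color interpolation and midpoint handling are not reproduced.

```python
def scale_positions(values):
    """Map each value to 0.0 (lowest, red end) through 1.0 (highest, green end)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]  # all equal: put everything mid-scale
    return [(v - lo) / (hi - lo) for v in values]


# Invented monthly counts for two rows of the table.
data_analyst = [100, 150, 200]
total = [1000, 1500, 2000]

# One rule over the whole range: the smaller row is squeezed toward the red end.
whole_table = scale_positions(data_analyst + total)
print(whole_table[:3])

# One rule per row, as in the lesson: each row uses its own min and max.
print(scale_positions(data_analyst))
```

With a single rule, the total row dominates the scale; with one rule per row, each job title's own high and low months stand out.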

257:33

additionally which we're going to jump into a

257:34

little bit more later is we can go into

257:37

manage rules and we can see the current

257:40

conditional formatting applied

257:43

right now I have show formatting

257:45

rules for current selection I've selected

257:46

the top cell right up here so there's no

257:49

conditional formatting if I were to

257:51

change this to just this worksheet I can

257:54

then if I expand this down I can see how

257:57

this applies this type of

257:59

formatting of the red white green

258:01

applies to each of the different cells

258:03

and if I needed to actually control what

258:05

cells are actually selected I could do

258:08

that I could have also gone through

258:10

instead of done that copy formatting and

258:12

pasting I could have done a duplicate rule

258:14

and modifying the code as well but I

258:16

decided to do my way instead anyway this

258:18

is where you need to go if anytime you

258:20

need to manage conditional formatting we

258:22

click

258:25

okay let's crank this up a notch and get

258:27

into using some more advanced

258:29

functionality with conditional

258:30

formatting here we have a new table you

258:32

haven't seen before basically it has all

258:34

the different job titles the counts of

258:36

those jobs aggregated from our data

258:38

sheet the median salary what is their

258:41

work from home percentage or likelihood

258:44

based on the jobs and then finally I

258:47

have this job rank right here which

258:49

basically uses these cells that are

258:51

hidden right here that if we actually

258:53

expand it out goes through and

258:56

normalizes the values so in this case

258:59

the job count normalize it between zero

259:01

and one so this job count is 90 is the

259:04

highest so it gets a value of one where

259:06

it's the lowest gets a value of zero
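
The zero-to-one scaling described here is min-max normalization. A minimal Python sketch, with invented job counts and a hypothetical helper name:

```python
def normalize(values):
    """Min-max normalization: lowest value maps to 0.0, highest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal, so there is no range to scale by")
    return [(v - lo) / (hi - lo) for v in values]


job_counts = [10, 55, 100]  # invented sample counts
print(normalize(job_counts))
```

Putting every column on the same zero-to-one scale is what makes it reasonable to combine a count, a salary, and a percentage into a single score.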

259:07

anyway I did this for all the different

259:09

values and then from there provide a

259:11

certain weighting factor of like 0453 and

259:14

0.15 in order to weight it appropriately
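
A weighted score of this kind is just each normalized value multiplied by its weight, then summed. In the Python sketch below the weights are placeholders chosen for easy arithmetic, not the values used in the workbook, and the function name `job_rank` is hypothetical.

```python
def job_rank(count_norm, salary_norm, wfh_norm, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of three values that were already normalized to 0..1.

    The default weights are placeholders; they should sum to 1 so the
    result also stays between 0 and 1.
    """
    w_count, w_salary, w_wfh = weights
    return w_count * count_norm + w_salary * salary_norm + w_wfh * wfh_norm


# A job with the highest count, a mid-range salary, and the lowest remote share.
print(job_rank(1.0, 0.5, 0.0))
```

Changing the weights is how you encode your own priorities, for example raising the salary weight if pay matters more to you than the number of postings.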

259:17

this is all my bias and how I wanted to

259:19

actually do it so feel free to adjust it

259:21

to what you want anyway we have this

259:23

final job rank in order to assess based

259:26

on these three values and this is

259:28

commonly done especially in like kpis

259:30

and stuff like that so we're going to be

259:31

making like icons for this column so

259:33

let's get into formatting our first

259:35

column we're going to do job count first

259:37

and for this one I want to have data bar

259:40

so I'm going to come down into conditional

259:42

formatting into data bars and we'll add

259:46

these data bars right here I like the

259:49

bars in this case because we're dealing

259:51

with a count and we can really see

259:53

especially data analysts scientist

259:55

Engineers they really make up the

259:56

majority of the data here so it really

259:58

draws your attention to it next up is a

260:01

median salary we're going to do similar

260:03

to last time maintain a color scale

260:06

we're just going to do this first one

260:07

right here where green is the highest

260:09

salary and red is the lowest and then

260:11

one more we're going to do that work from

260:12

home we're also going to do it in a

260:14

color scale but for this one let's

260:17

actually do a different color go into

260:19

more rules and in this case we have this

260:21

new formatting rule window right here I

260:24

have two colors just say I want to do

260:26

one color I'm going to do white from the

260:29

lowest value and then we'll do like

260:31

purple for the highest value anyway this

260:34

is all basically to show a point this is

260:37

becoming

260:38

entirely too much visually

260:41

distracting if you were to

260:43

give this to somebody else or a

260:44

stakeholder where are they supposed to

260:46

look and actually organize their

260:48

thoughts on where they should

260:50

potentially pursue a job right now I'm

260:52

thoroughly confused at looking at this

260:54

so let's clean this up a bit and for

260:57

this I want to make it to where I like

260:59

maintaining a solid color across so

261:02

that way you know like hey if this color

261:05

is darker or there's more of this color

261:07

I should be looking there so in this

261:09

case we'll make this job count we're

261:10

going to just clean it up

261:12

slightly for the data bars we're going to

261:14

make this like gradient appearance cuz

261:17

then I feel we can see the numbers

261:18

better and it's not too visually

261:19

distracting for the median salary I

261:22

really my goal of this is to find jobs

261:25

that are let's say greater than 100,000

261:27

so let's actually just make highlighting

261:29

that highlights those jobs that are

261:31

greater than this value in this case I'm

261:33

going to come to conditional formatting

261:35

and enter a new rule this new formatting

261:39

rule popup comes up once again

261:41

and we have a select a rule type this

261:44

allows us to do things like format all

261:47

cells based on the value format only top

261:49

or bottom rank values format only values

261:51

that are above or below average I

261:54

personally like this one of use a

261:55

formula to determine which cells to

261:57

format and in this case I want to say

262:00

I'm going to select this formula thing

262:02

right here I want to look at you can

262:04

just select the first item in the item

262:06

selected so I'll select D3 it's going to

262:09

go through and actually do all of these

262:10

don't worry we'll see and for that we

262:12

want to highlight those that are greater

262:14

than

262:15

100,000 and press enter and then right

262:18

now it doesn't have any format set so

262:20

I'm going to change this to format and

262:22

we can control a whole host of things

262:24

such as the fill border font and the

262:27

number formatting itself but we're going

262:29

to stick with that blue theme I'm going

262:31

to just come down in here and I'm going

262:33

just select this blue color right here

262:35

and click okay and then okay again now

262:38

you notice my formatting is not

262:40

appearing that's because we have

262:43

multiple formatting applied to a cell

262:45

which you can do so in order to fix this

262:47

we need to come into manage rules and as

262:51

we see we have both of these applied to

262:54

it so I actually need to select this one

262:55

and I need to delete this Rule and click

262:59

apply and then okay now we're running

263:02

into our second issue and I slightly

263:05

misled you earlier when I said that D3

263:07

works if we go back into manage our

263:10

rules and we see our formula right here

263:12

I'm going to double click it we don't

263:14

need to actually provide an absolute

263:16

reference to a D3 because it's actually

263:18

going to evaluate all those cells based

263:20

on D3 instead we want it to be D3

263:23

without the dollar sign so it's not an

263:24

absolute reference and therefore

263:26

whenever I click okay and okay again bam
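
To make the absolute-versus-relative point concrete, here is a minimal Python sketch of how a formula-based rule behaves across a range. It is only a model of the behavior described here, not how Excel is implemented; the cell values and the function names highlight_absolute and highlight_relative are made up for illustration.

```python
# Hypothetical median salaries for rows 3 to 7 of column D (made-up numbers).
salaries = {"D3": 90_000, "D4": 125_000, "D5": 101_000, "D6": 80_000, "D7": 130_000}


def highlight_absolute(cells, anchor="D3", threshold=100_000):
    """Model of =$D$3>100000: every cell is judged by the anchor cell's value."""
    verdict = cells[anchor] > threshold
    return {ref: verdict for ref in cells}


def highlight_relative(cells, threshold=100_000):
    """Model of =D3>100000: the reference shifts, so each cell is judged by its own value."""
    return {ref: value > threshold for ref, value in cells.items()}


print(highlight_absolute(salaries))
print(highlight_relative(salaries))
```

With these sample values the absolute version highlights nothing, because D3 itself is under 100,000, while the relative version flags D4, D5 and D7.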

263:30

now it knows appropriately to check the

263:33

actual cell that it's looking at within

263:36

the range on whether to highlight it or

263:38

not moving on to the work from home

263:40

we're going to keep this similar in that

263:42

it's not going to be purple though we're

263:44

going to change this to Blue instead so

263:46

going into manage rules we have the

263:49

actual color right here selected I'm

263:51

going to just go in and change this to

263:53

this color that we used previously and

263:55

click okay and then okay as well so that

263:58

way it applies it all right the last

264:00

thing is the job rank itself and

264:03

for this we're going to be using icon

264:05

set

264:06

specifically I like this one over here

264:09

on ratings but this becomes a little bit

264:12

overwhelming when we have this

264:13

rating and also the number next to it so

264:16

we can actually remove this number in

264:18

the column we go back into manage rules

264:22

we can double click on that icon set

264:24

Rule and we can even further customize

264:27

when these stars are appearing but I'm

264:29

going to just go ahead and get to this

264:31

portion where it says show icon only

264:33

this allows us to show only the icon so

264:35

going into applying this bam it's now

264:38

showing the icon I want that icon

264:40

centered both vertically and also

264:43

horizontally so bam now whenever I look

264:45

at this I can see especially since it's

264:48

all one color my eyes really gravitate

264:51

to well data scientists and data

264:54

Engineers based on this full star rating

264:57

and more of the blue being in this

264:59

region and that's what I would hope

265:01

people would go to or gravitate to as

265:03

well when they're looking at it one

265:05

quick note in this conditional

265:06

formatting we didn't cover this highlight

265:08

cell rules where you highlight greater

265:10

than or less than or you do a top or

265:12

bottom rule where you can highlight the

265:14

top 10% or top 10 you can also adjust

265:16

that number anyway I find myself

265:19

more using custom rules instead by

265:22

coming in here into new rule and then

265:25

actually fine-tuning what I want to do

265:28

so with the practice problems I'd really

265:30

dive into actually relying on using

265:32

these types of options instead and so as

265:35

I subtly hinted you have some practice

265:38

problems now to go through and really

265:40

practice how to do formatting and also

265:42

more specifically conditional formatting

265:44

in the next lesson we're going to be

265:45

moving into collaboration and covering how

265:48

to actually protect your workbooks and

265:50

your worksheets so that way whenever you

265:52

share these with co-workers or friends

265:55

they don't go through and actually mess

265:56

them up all right with that I'll see you

265:58

in the next

266:02

one welcome to this last lesson in

266:05

advanced spreadsheets before we jump into

266:06

our project and this lesson itself is on

266:09

collaboration which sounds sort of

266:11

cheesy but in order to demonstrate what

266:13

we're actually going to be learning in

266:14

this lesson we need to actually

266:16

fast forward a little bit and jump into

266:18

our project so I'm going to open up the

266:20

salary dashboard which is located under

266:23

project One dashboard so here's the

266:25

dashboard that we're going to build in

266:27

it they have three boxes that you can go

266:30

through and select this is going to be

266:32

using data validation which we're going

266:34

to be learning about in this lesson but

266:36

it allows you to basically standardize

266:38

the inputs that we want somebody to

266:40

actually select in order to get the

266:43

results and it prevents them from

266:44

putting in values that maybe don't exist

266:47

and then breaking our dashboard so for

266:49

each of these job titles country and

266:51

types we have an associated

266:53

visualization for each showing the

266:55

salary by job title the salary by region

266:58

and then also salary by job type finally

267:01

at the bottom I have some I call them

267:03

KPI cards basically outlining certain

267:07

characteristics or certain indications

267:08

of the median salary what is the top job

267:11

platform and then what is the count

267:13

of jobs but I can come in here and

267:15

select something like maybe I wanted to

267:16

look at business analyst and it's going

267:19

to filter Down based on this telling me

267:22

what their median salary is that

267:23

LinkedIn is probably the best place to

267:24

go to for this what are the different

267:26

types of roles available and what's

267:28

available in the job database so the

267:30

other feature we're going to be going

267:31

through besides this data validation

267:33

process that we can do right here is

267:35

actually protecting your sheets which

267:37

you can find this here underneath review

267:39

under protect but anyway if you try to

267:43

move these cells around you're not able

267:46

to at all so we're going to be able to

267:48

design this dashboard in a way that

267:51

other co-workers won't be able to

267:53

destroy it additionally if you notice

267:55

down here at the bottom there's only one

267:58

sheet in here there's actually other

268:00

Sheets if I go to unhide here there's

268:03

other sheets I'll just unhide one of

268:05

them we'll just unhide data there's

268:07

other sheets inside of here but if

268:09

they're not applicable to my co-workers

268:11

or stakeholders I don't need to have

268:12

them so I can hide them so that's

268:15

another feature we're going to go over in

268:19

this all right nothing be yaen let's

268:21

actually get into this lesson for this

268:22

we're going to be using the

268:23

collaboration workbook in chapter 4 now

268:27

we're going to be building out these

268:29

three sheets as we go along and as a

268:32

sneak peek in this first example we're

268:34

going to be building out this little

268:36

portion right here this is going to be

268:37

basically preparing us for our project

268:39

so a lot of this work is going to be put

268:40

to good use anyway we're going to be

268:42

building the simple one right here I'm

268:44

going to zoom in where we have based on the

268:46

job title we can go through and select

268:48

it so senior data engineer it's going to pop

268:50

up with our median salary so that's what

268:52

we're going to be building with this and

268:54

specifically we're going to be using

268:55

this feature of data validation so I'm

268:58

going to create a new sheet to start

268:59

with because I don't want to start with

269:01

the answer right there I'm going to just

269:02

call it calculator I'm going to put in

269:05

job title here and then median salary

269:09

below I'm also going to bold these by

269:11

pressing Ctrl+B and then next

269:14

to it in column C is where we're

269:15

actually going to use the actual control

269:17

of this now we need to get a list of job

269:20

titles to put in this so I'm going to

269:22

create a new sheet and call it

269:25

validation and basically what I'm going to

269:26

do with this is create a sheet of all of

269:31

the different job titles available

269:34

specifically I'm going to say this is

269:35

going to be from the column job title

269:38

short and we're going to be using in

269:39

order to get the unique values of it

269:41

well the unique function we need to

269:44

provide it an array so I'm going to come

269:46

back over here down to column A2 use

269:49

control shift select all the way down

269:52

close the parenthesis press enter okay
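
As a rough picture of what the unique function is doing with that column, here is a small Python sketch. The job titles are made-up sample data and the helper name unique is only an illustration; note that this sketch compares text exactly, which may differ from how Excel compares text.

```python
def unique(values):
    """Return the distinct values in order of first appearance."""
    # dict keys keep insertion order, so duplicates drop out without reordering.
    return list(dict.fromkeys(values))


# Hypothetical slice of the job title short column.
job_title_short = [
    "Data Analyst",
    "Data Engineer",
    "Data Analyst",
    "Data Scientist",
    "Data Engineer",
    "Data Analyst",
]

print(unique(job_title_short))
```

Each title comes back once, in the order it first shows up in the column, which is why a separate sorting step is still needed afterward.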

269:55

so now we have all of our different

269:56

values I'm going to expand this out I'm

269:58

also going to zoom in a little bit now

270:00

whenever I do this drop-down menu I

270:03

want it in some sort of order

270:05

specifically I want it in probably what

270:06

is the highest count value I want it

270:09

appearing at the top and those that are

270:10

less likely down at the bottom so what

270:12

I'm going to do is actually just copy

270:13

this value right here because this is

270:14

what we actually want to use what we

270:16

want to do is a COUNTIFS we want to

270:19

count based on a condition for the

270:21

criteria range we're going to be

270:23

providing that job title short column

270:26

from our table and then for the criteria

270:29

we're going to be selecting right next

270:30

to it A2 and then we'll just autofill it

270:33

all the way down and then finally we

270:36

want to now sort it by this so I'll use

270:37

job title short sorted from there we'll

270:41

use the sort function to then sort this

270:45

by the second column position in

270:48

descending order so bam this is more

270:51

like what I want I want those data analysts

270:53

data scientists and engineers at the top and

270:55

the senior roles and so on cloud

270:56

engineers are below so we now have this

270:59

list available that we want to use for

271:01

data validation speaking of which I'm going to

271:03

go back to the calculator tab that I

271:05

made and for this we're going to go to

271:07

the data tab specifically under data

271:10

tools they have this selection

271:12

available where data validation

271:15

actually is and now this is going to

271:18

allow us to well customize it right now

271:21

the data validation for this cell is any

271:25

value I can place any value into it I

271:27

could limit it to a whole number I could

271:30

limit it to decimals a list a date a

271:32

time a bunch of things we're going to

271:34

limit it to basically a list of values

271:38

and we need to basically provide a

271:40

source for this so for the source we're

271:43

going to go in and select the validation

271:46

tab that we just made and I'm going to

271:48

select all the different jobs right here

271:50

and then press enter from here I'm going

271:52

to accept this and press okay now as you

271:56

can see we have this little drop down

271:58

right next to it and I have different

272:01

selections actually available of data

272:03

engineer if I were to go into here

272:05

because I have this uh set to data

272:07

validation if I was going to put in

272:09

something like data nerd which isn't

272:11

available and press enter it says this

272:13

value doesn't match the data validation

272:15

restrictions defined for this cell

272:17

therefore I have to go in and retry
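
Putting the last few steps together, the drop-down is a list of allowed values ordered by how often each title appears, and anything outside that list gets rejected. Here is a hedged Python sketch of that idea. The sample titles and the names build_dropdown and validate are hypothetical, and this only mimics the behavior shown on screen.

```python
from collections import Counter


def build_dropdown(job_titles):
    """Distinct titles, most frequent first (a count-then-sort-descending step)."""
    counts = Counter(job_titles)
    return [title for title, _ in counts.most_common()]


def validate(value, allowed):
    """Reject any entry that is not in the allowed list."""
    if value not in allowed:
        raise ValueError(
            "This value doesn't match the data validation restrictions defined for this cell."
        )
    return value


# Hypothetical column of job titles with different frequencies.
titles = ["Data Analyst"] * 3 + ["Data Engineer"] * 2 + ["Cloud Engineer"]
dropdown = build_dropdown(titles)

print(dropdown)
print(validate("Data Engineer", dropdown))
```

Passing something like "Data Nerd" to validate raises the error instead of accepting the entry, which is the same retry situation as in the spreadsheet.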

272:18

so only values within there are going to

272:20

be able to work in this so now let's

272:22

actually get into calculating that

272:24

median salary and for this we're going

272:27

to create a new sheet similar to this

272:28

median salary sheet we're going to call

272:30

this one salary wrong spot need to

272:32

actually enter it down here and call

272:34

this one salary throw this all the way

272:36

over first I need the names of job title

272:40

short and all that kind of good stuff so

272:42

what I'll do is I'll come over to our

272:43

validation Tab and I've selected equal

272:46

to already I'm going to select these

272:48

cells right here press enter so now

272:50

they're all appearing here now I'm going

272:53

to calculate the median salary for all

272:56

these jobs I know our calculator or

272:59

dashboard has uh only one value that is

273:02

calculating at a time but in our dashboard

273:05

we're going to build we're actually

273:06

going to build a graph with all these

273:07

median salaries so we just need to

273:09

calculate them now all the median

273:11

salaries and then basically calculate

273:13

using data validation and also an XLOOKUP

273:16

what the median salary is going to be

273:18

here so for this we're going to be using

273:21

the median function and specifically

273:23

we're going to be using that if inside

273:24

of it because MEDIANIF isn't available

273:27

we first want to check does the job

273:29

title here of data analyst meet our

273:32

condition of the job title short so I'm

273:34

going to type in the table itself of

273:37

jobs and then the column of job title

273:41

short close bracket and set an equal

273:44

sign equal to A2 then I'm going to close

273:48

the parentheses on this and actually we

273:49

need to wrap all this in parentheses

273:51

because we have to do multiple different

273:53

conditions we're going to do some array

273:55

multiplication the other thing we have

273:56

to check is that the values are not

273:59

blank or not equal to zero so once again

274:02

I'll put in jobs again and we're going

274:04

to be using that salary year average

274:07

column and we want to make sure that it

274:09

doesn't equal to zero and so that's the

274:12

condition we're checking for and so now

274:15

what do we want to return if true well

274:16

we want to return the salary so we'll do

274:19

jobs and then salary year average I'll

274:23

then close the brackets on that then we

274:25

need to close one parentheses I can see

274:27

a red parentheses still and then a final

274:29

black parenthesis now I'm good press

274:31

enter looks like I got it right on the

274:33

first try let's actually drag this down
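
The formula just described comes out to roughly =MEDIAN(IF((jobs[job_title_short]=A2)*(jobs[salary_year_avg]<>0), jobs[salary_year_avg])), where the exact column spellings are an assumption on my part. As a sketch of the same logic in Python, with made-up rows and a hypothetical helper named median_if:

```python
from statistics import median

# Hypothetical rows standing in for the jobs table; None models a blank salary.
jobs = [
    {"job_title_short": "Data Analyst", "salary_year_avg": 80_000},
    {"job_title_short": "Data Analyst", "salary_year_avg": 100_000},
    {"job_title_short": "Data Analyst", "salary_year_avg": 0},
    {"job_title_short": "Data Analyst", "salary_year_avg": None},
    {"job_title_short": "Data Engineer", "salary_year_avg": 120_000},
    {"job_title_short": "Data Engineer", "salary_year_avg": 125_000},
    {"job_title_short": "Data Engineer", "salary_year_avg": 130_000},
]


def median_if(rows, title):
    """Median salary for one title, skipping blank and zero salaries."""
    salaries = [
        row["salary_year_avg"]
        for row in rows
        # Both conditions must hold, like multiplying the two condition arrays.
        if row["job_title_short"] == title
        and row["salary_year_avg"] not in (None, 0)
    ]
    # This sketch returns None when nothing matches.
    return median(salaries) if salaries else None


print(median_if(jobs, "Data Analyst"))
print(median_if(jobs, "Data Engineer"))
```

Multiplying the two conditions in the spreadsheet plays the role of the "and" here: a row only contributes its salary when the title matches and the salary is not blank or zero.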

274:36

boom this is pretty nice so now we have

274:38

all the median salaries for these

274:40

different job titles I'm also going to

274:42

take this a step further of actually

274:43

sorting this by the median salary because I

274:44

know I'm going to be actually

274:45

visualizing this in the project lesson

274:48

so we'll go ahead and sort this as well

274:51

sorting it on the second index in

274:53

descending order so now we need to

274:56

provide the value in this case data

274:57

engineer is selected we need to

275:00

provide based on this value the median

275:03

salary and I want to just calculate it

275:05

over here just in case I need to go back

275:07

to it so for this I want basically

275:10

125,000 right here in G2 so I'm

275:14

going to provide an XLOOKUP and the

275:17

first thing is this lookup value right

275:19

we're going to look up the data engineer

275:22

in this now I'm not going to use a cell

275:25

reference of going over here of

275:28

selecting this cell of data engineer

275:30

which is calculator C2 I'm actually

275:32

going to escape out of this we're going

275:34

to stop this right here I want to go

275:36

back to this I actually instead because

275:38

I'm going to be referencing these cells

275:40

specifically well this one right here

275:43

a lot I'm going to just rename this from

275:47

C2 to title so right now I can see that

275:50

it is named title so going back over to

275:54

that salary tab again now we can perform

275:57

our XLOOKUP and for the lookup value

276:00

we're trying to look up the title for

276:03

the lookup array we're looking up

276:04

through these job titles right here and

276:06

then for a return array the actual

276:08

salary values so now we're getting that

276:11

data engineer value of

276:13

125,000 similarly I also want to name

276:16

this cell as well I'm going to name this

276:18

one median salary pressing enter boom

276:22

locks it in so now when I come back over

276:24

to my calculator tab I can just put in

276:27

here equal to median salary I'm also

276:31

going to go through and format this to

276:33

make this look

276:37

better so just playing around with this

276:39

I can see that I can put in something

276:41

like senior data analyst and then

276:43

the associated median salary is going

276:45

to come up with it but let's say now I

276:46

want to give this to a coworker right

276:49

how can I prevent them from going in and

276:51

potentially you know entering in this

276:53

cell and then breaking it well we can

276:56

come up here to review and in this case

276:59

we're going to select this of protect

277:01

sheet now the first thing you can do you

277:03

can set a password to unprotect sheet

277:06

I'm not going to put a password but say

277:07

you wanted to put one you could and then

277:09

we have these options for what you

277:12

can actually protect whether that's

277:13

select locked cells or select unlocked cells

277:16

to protect we're just going to leave

277:17

both of these checked for the time being

277:19

click okay and now well one we can see

277:23

that underneath protect here it now says

277:25

instead of protect sheet it says

277:26

unprotect sheet whenever I go through

277:28

this and say I want to change any

277:30

value whatsoever I can't change it so

277:33

it's good because the numbers can't

277:35

change or the median salary title can't change

277:37

but now I can't change the job title which

277:40

is a little bit of a pain so

277:41

unfortunately Excel doesn't necessarily

277:44

make this the easiest I'm going to start

277:46

over again and just click unprotect

277:48

sheet and what we want to do is we're

277:51

going to select all the cells in here so

277:54

with all the cells selected I'm going to

277:55

press control and unselect C2 then right

278:00

clicking it I'm going to go into format

278:03

cells now under this protection tab

278:07

right here we're going to notice we have

278:09

options for locked and hidden we want to

278:11

actually be able to lock all the cells

278:15

except for C2 we don't want to hide any

278:17

so we're not going to adjust that right

278:18

now but now we're going to have the

278:20

ability to adjust whether it's locked or

278:22

not this doesn't actually change

278:24

anything right now so if I go into here

278:26

yes I locked those certain cells but if

278:28

I were to type into here it's still

278:29

going to allow it to be changed so now

278:33

what I can do is go into protect sheet

278:35

and previously we had both of these

278:36

selected of select locked cells and select

278:38

unlocked cells and in this case because we

278:42

locked all the cells except for C2 we

278:45

only want to allow people to select the

278:47

unlocked cell of C2 so I'm going to

278:50

uncheck this click okay and now I can't

278:54

click anywhere else except for where

278:57

I've set up that data validation in this

278:59

cell and I can still change it and it

279:01

will manipulate the value now we could

279:03

also go through and protect the workbook

279:05

itself I don't necessarily manipulate

279:08

with this as much instead what I

279:10

would want to do in this case is

279:12

actually hide all these other sheets

279:15

with the exception of this calculator

279:17

and so I can do this by right clicking a

279:20

tab and selecting hide so I'm going to

279:22

go through and actually hide all of them

279:24

so now we have everything as shown by

279:25

this tab down here of calculator we have

279:28

every tab hidden except for that and if

279:30

I wanted it to

279:32

reappear or get a sheet to reappear I

279:34

would just right click it click unhide

279:36

and then it's going to allow me to

279:37

select which option I can unhide and

279:41

if I do want to make it to where a user

279:43

can't go in and necessarily unhide

279:45

sheets well I can go in here and select

279:48

protect workbook once again I can enter

279:51

a password if I wanted to I'm going to

279:53

just set this up but now when I come

279:55

down here to right-click it there's no

279:57

option to hide or unhide a sheet so the

280:01

entire workbook is now protected so I'm

280:04

not going to lie that was definitely an

280:06

advanced intro into data validation and

280:08

also protecting your workbooks but I

280:10

promise it's going to just come into

280:11

great use for whenever we're building

280:13

this project which we'll get to next

280:15

now we do have some practice problems

280:17

for you to go through and just test out all

280:18

these different features and with that

280:20

we'll be jumping into the next lesson and

280:23

actually building this data science

280:25

salary dashboard with that I'll see you

280:27

in that

280:31

one all right let's now dive in and

280:34

build our first project with Excel which

280:37

is this data science salary dashboard

280:40

this project is going to combine

280:42

everything that we've used and learned

280:44

up to this point from formulas and

280:46

functions to charts and then even to

280:48

data validation we're going to start

280:50

first by looking at the dashboard itself

280:53

you can just go to the project One

280:54

dashboard folder and Open salary

280:57

dashboard workbook now in this right now

280:59

you're only going to see one sheet and

281:01

as you try to click around you're not

281:03

going to be able to do anything so as a

281:04

refresher if you want to actually dive

281:07

in and see what's going on behind the

281:09

scenes you'll need to First if you want

281:11

to actually touch any of these points

281:13

actually go into the review Tab and

281:16

click unprotect sheet then you'll be

281:18

able to investigate how I name certain

281:20

cells and whatnot additionally if you

281:23

want to investigate any of the worksheets

281:25

that I worked on you'll need to go into

281:27

unhide and select the appropriate

281:30

worksheet that you want to well unhide so

281:33

for this we're going to be building it

281:34

out section by section specifically

281:36

we're going to start up at the top

281:37

building these data validation drop-down

281:39

menus then from there we'll go into

281:43

building the different graphs associated

281:45

with it and then finally we'll end up

281:47

with these KPI cards now powering each

281:50

one of these major topics I've built

281:53

individual sheets so for things like jobs

281:56

I have all the jobs along with any key

281:59

information to then build the

282:01

visualizations in it so here is the

282:03

basically the table that I made in order

282:05

to show the graphic right here similarly

282:09

for Country I have all the different

282:10

countries and then their associated

282:12

median salaries and I use that to not

282:14

only make the drop down but also make

282:16

the graph same thing for type and then

282:18

finally for platform anyway that's just

282:21

a quick overview to make sure that

282:23

you're familiar with how we're

282:24

going to be working through this but

282:25

let's actually dive into

282:29

it for this I recommend picking up where

282:32

we left off in the last lesson on

282:35

collaboration we did a lot of work for that

282:37

so we're going to use this workbook

282:39

first thing I'm going to do once this is

282:40

open I'm going to go in and actually

282:42

save it as this final dashboard and I

282:44

recommend that during this you're saving

282:45

this pretty frequently so we don't lose

282:47

progress first thing I'm going to do is

282:48

start moving this around I basically

282:51

know where I want to get these different

282:52

titles of these drop downs and then

282:54

where I want to put the drop downs we're

282:56

not going to be using median salary for

282:57

a little bit so I'm just going to take

282:59

that Ctrl+X it and place it down at

283:01

the bottom then take the job title put

283:04

it in C3 and then move the data

283:06

validation to right below that we'll fix

283:10

all the formatting when we get to it later

283:12

on okay so we have the job title now

283:14

the next thing we need to jump into is

283:16

country and we'll be putting that right

283:18

under this portion right here for this

283:21

I'm going to create a new sheet and call

283:23

this country with all these sheets I

283:25

want to have them pretty much similar to

283:28

what the title is above it so in this

283:32

case here where we had median salary

283:34

it's actually the titles I have

283:38

named it in the previous one salary so

283:40

let's go ahead and just name this title

283:43

anyway going back to that country tab

283:45

that's where similar to the title tab if

283:47

you see we first grab the names of the

283:50

job titles from there and then calculate

283:52

the median salaries for each we're going

283:54

to be doing something similar in the

283:56

country tab with first putting in the

283:59

country names and then from there

284:01

putting in that median salary but I want

284:04

to keep a similar format as in this

284:06

title case remember we actually pulled

284:09

this from the data validation tab which

284:12

we're pulling here so I want to keep

284:14

this consistent anytime we're creating

284:17

anything for those drop downs we're

284:19

going to make it here in this data

284:20

validation tab so I'm going to create a

284:22

column here called job country and then

284:25

in this I want to get the unique values

284:28

from our data set specifically that jobs

284:31

table it's still named that jobs table

284:33

and of that column job country go ahead

284:38

and close the brackets and then close

284:40

parentheses and now we have all of these

284:42

different countries not sure why but

284:43

this is bolded I'm going to go ahead and

284:44

remove that anyway I want this in a

284:46

sorted format I'm not going to

284:48

necessarily sort it by count like we

284:50

did here with the job titles I'm just

284:52

going to sort it in alphabetical order

284:54

so I'm going to use the sort function

284:57

and I'm just going to identify that we

284:59

wanted to use

285:00

G2 hashtag and Bam now we have all of

285:04

this I'll also name this appropriately job

285:07

country sorted so now we have our list

285:11

we can go back into here and actually

285:14

put in the country for the data

285:17

validation portion we do that by going

285:19

to the data tab selecting data

285:22

validation and the values we want to

285:25

provide a list to this and for the

285:27

source we go back to that data

285:29

validation tab close this out and we

285:32

basically want to select all these

285:33

values here so I'll just do control

285:35

shift down pressing enter we now have

285:38

everything all the criteria for this I'm

285:40

going to go and click okay and I get

285:42

this error message and there's a problem

285:43

with this formula for some reason I

285:45

guess when I move back it added this

285:47

extra sheet in here I'm not too sure

285:50

this extra data I can't even select in

285:52

here anyway just make sure it's only one

285:54

sheet there it's going to work fine

285:56

country is now in here I can select something

285:58

like Argentina next value that we're

286:00

going to be looking at is the job type

286:03

so part-time full-time whatnot with this

286:05

although we're not going to use it yet

286:06

I'm going to create a new sheet and call

286:08

it type and also move that to the end

286:12

but now we want to get the unique values

286:14

of job schedule type so I'll put in the

286:16

column here of job schedule type and

286:20

then from there we want to get the once

286:22

again unique values for this we're using

286:24

the jobs table specifically that job

286:27

schedule type column and Bam now you

286:30

will notice from this one this one it's

286:33

a little bit this needs some data clean

286:34

up with it there's a lot of values in

286:37

here like sometimes it has combined

286:39

values like full-time part-time and

286:40

internship and whatnot we really I'm

286:43

actually going to expand this column out

286:44

we really just want the single values

286:46

from this so something like full-time

286:49

contractor part-time internship and then

286:52

also temp work so the first thing I'm

286:54

noticing about the ones we want to

286:56

remove is that they contain the word and

286:59

so we'll first identify those that contain

287:01

the word and and we do this using the search

287:04

function which is a text function to

287:06

find text specifically we're looking for

287:08

that keyword of and for within text we want

287:11

to just look through the whole array so

287:13

we'll put in J2

287:15

hashtag and I got a little error message

287:18

I need to make sure I use double quotes

287:19

for the text itself and running this now

287:22

I have basically number values for where

287:27

the and is located at and it looks like

287:30

yeah it looks like we're good on

287:31

everything with the exception of the

287:33

zero which we'll get in a little bit

287:34

okay so we need to convert this into

287:36

basically Boolean values because we're

287:38

going to end up using this to

287:41

pull out what we want using a filter

287:43

function so we're going to wrap this in

287:46

the ISNUMBER function and we're going to get

287:49

false or true and whatnot anyway all

287:52

right so now we have false or true the

287:53

last thing we need to do is use well not

287:56

the last thing second last thing we're

287:57

going to use the filter function and in

288:00

this we provide the array so in this

288:03

case it's going to be J2 hashtag and

288:07

then for what we want to include is this

288:10

other array that we just did so I'm

288:12

going go ahead and close this and see

288:14

what we get returned back and we're

288:16

returning now only the values that have

288:20

and in it we actually wanted to do the

288:22

opposite of that right we want the

288:24

values that don't have an and so in

288:26

order to do that we're going to fix this

288:27

entire statement right here for the

288:30

include portion we're going to wrap it

288:32

in a giant NOT to turn everything

288:35

around add an extra parenthesis on the

288:37

end bam now we have full-time contractor

288:40

part time we got the zero in there

288:42

internship and temp work we just need to

288:44

remove this zero out of it so we just

288:47

need to modify once again this right

288:50

here this portion of this include we're

288:53

going to do some array multiplication

288:55

basically once again looking through and

288:57

making sure no values equal to zero so

289:00

I'm going to do a multiplication do an

289:03

opening closing parenthesis and

289:05

basically we're just checking whether J2

289:08

hashtag is not equal to zero let's go

289:12

ahead and enter this boom now we have it

289:15

down to the values that we want for this
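
For reference, the steps described above can be written out roughly as follows. This is a sketch, assuming the unique job schedule types spill from cell J2 (so J2# refers to the whole spilled range). The lines are successive versions of the same cell's formula, not five separate cells.

```excel
=SEARCH("and", J2#)
=ISNUMBER(SEARCH("and", J2#))
=FILTER(J2#, ISNUMBER(SEARCH("and", J2#)))
=FILTER(J2#, NOT(ISNUMBER(SEARCH("and", J2#))))
=FILTER(J2#, NOT(ISNUMBER(SEARCH("and", J2#))) * (J2#<>0))
```

Multiplying the two logical arrays acts like an AND: a row is kept only when it does not contain "and" and is not the zero that stands in for the blank entry.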

289:19

I'm going to name this appropriately job

289:22

schedule type sorted also for some

289:26

reason this is in this column we're

289:29

going to move it over looks like we're

289:30

off by one space anyway now we need to

289:33

go back to our basic calculator Tab and

289:36

we need to enter data validation in this

289:39

portion to make sure we can select the

289:41

right type so I'm going to select data

289:43

validation once again allow values of

289:46

list and then for the actual Source

289:48

itself we'll go to that data validation

289:50

tab select all these values in here

289:53

press enter and enter okay so now we

289:57

have the type in here so all of our data

290:00

validation portions are now

290:05

built next thing up is moving into

290:08

building the three different charts here

290:10

we're actually going to start with the

290:11

country chart because it's the easiest

290:14

and a sneak peek of what data is

290:16

actually needed for this I can go to the

290:18

country tab inside my final salary

290:20

dashboard and all we really need to do

290:22

is for each country calculate the median

290:24

salary and then throw it into a map

290:26

graph so back to our Excel worksheet

290:28

first thing we need to do is get those

290:30

list of countries and remember we

290:31

already have that so I'm going to put an equal sign

290:34

it's inside of our data validation here

290:36

with these sorted values I want all

290:40

these values here so I'm going to

290:40

do H2 hashtag press enter we have all

290:44

of them so let's actually start

290:46

developing the formula for building this

290:49

out we're just going to

290:51

calculate first the median salary for

290:54

that country and then also remember in

290:56

the past we've had to filter out any

290:58

values that basically equal zero so for

291:00

that IF condition for the logical test

291:03

we're going to do we're going to have to

291:04

do array multiplication and for our

291:07

first array we're going to be checking

291:08

for the job country right so we do that

291:11

jobs table and specifically that job

291:15

country column and we want to make sure

291:18

that it's equal to basically A2 in this

291:21

case the country right next to it

291:23

additionally we want to check that

291:24

there's no zero values and so we're going

291:26

to be checking the salary year average

291:27

column and making sure that it's not

291:29

equal to zero so now moving on to the

291:32

value if true we basically want to use

291:35

the salary year average column value

291:37

if false not applicable here go ahead and

291:39

close this looks like we have a typo it

291:42

went ahead and added that extra

291:44

parenthesis and we have a median salary

291:46

now and go ahead and copy that all the

291:47

way down now this is great but remember
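
A sketch of the formula at this stage, assuming the table is named jobs and the columns spoken as "job country" and "salary year average" are spelled job_country and salary_year_avg (the exact spellings are an assumption), with the country name in A2:

```excel
=MEDIAN(IF((jobs[job_country]=A2)*(jobs[salary_year_avg]<>0), jobs[salary_year_avg]))
```

The value if false argument is left out, so rows that fail either condition return FALSE, which MEDIAN ignores.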

291:51

if I go back here to the basic

291:54

calculator tab we also want to not only

291:56

filter for a specific country but also

292:00

we're going to need to filter for a job

292:03

title and also for a job type so we need

292:07

to include not necessarily the country

292:09

because we're doing it for each country

292:10

but we need to include the job title and

292:13

the type now in order to add that this

292:15

formula is going to get a lot longer and

292:17

it's now getting hard to read so I want

292:20

to actually I want to one I want to

292:21

operate in this formula bar if you press

292:23

control shift U it expands it out and

292:26

then from there you can actually change

292:28

it to the desired length that you want

292:30

so what I'm going to do now is actually

292:32

break this into new lines I can press on

292:35

Windows you're going to press Alt Enter on

292:40

the Mac I'm pressing option return

292:43

anyway I've gone ahead and broken this into

292:44

different lines I've also inserted some

292:47

spaces in there to basically put in some

292:49

indentation so I can read it better

292:50

don't have to necessarily do that but

292:52

now I feel like this is much more readable

292:54

for my eyes go ahead and execute this

292:57

and Bam we have all the results and if I

292:59

do a drag and drop all the way down all

293:02

the other ones are updated as well so

293:04

the first thing we need to add to this

293:05

is to check for the job title itself so

293:09

I'm going to put a multiplication there go

293:12

to the next line pressing Alt Enter and

293:15

for this I want to check jobs

293:18

specifically I want to check that job

293:20

title short column and whether it's

293:23

equal to basically title remember we

293:26

created title so I'm going to go ahead and

293:29

press enter and it looks like we have a

293:32

typo because I forgot to insert a

293:35

parenthesis at the end press enter looks

293:38

like I misspelled the actual table

293:40

itself my bad press enter again now I'm

293:43

getting this name error right here and

293:45

that's because of this title that we're

293:46

using if we go back to that basic

293:48

calculator and select that cell C4 right

293:51

here it's named title_ex and I can

293:56

inspect the different names assigned to

293:59

cells by going to formulas defined names

294:03

and then the name manager now I started

294:06

directly with this workbook before we

294:08

actually created all these variables

294:10

here so what we'll do is this I'm going

294:12

to go ahead and actually just delete

294:15

this title_ex that was just an

294:18

example that's why it says ex then from

294:21

there I'm going to just rename it I'm

294:22

going to select the cell itself of C4

294:25

and I'm going to change it back to title

294:28

okay now it's title here back to the

294:30

country tab uh we have this updated for

294:32

the title it's actually appearing now no

294:34

name error and I'll go ahead and drag it

294:36

all the way down there's going to be a

294:37

lot fewer values for this cuz we're

294:39

further filtering this so I'm seeing

294:41

some #NUM errors that's as expected all

294:43

right the last condition we need to now

294:45

take into account is this type right

294:48

here and we haven't named this cell

294:50

already so I'm selecting K4 and I've

294:52

come up here and I'm going to select

294:54

type and now I've renamed that as type so

294:59

we can finish this formula off we wanted

295:01

to I'm going to do a multiplication sign

295:03

start a new line by pressing Alt Enter

295:07

then do opening and closing parentheses

295:09

for this we want to check if the job

295:11

schedule type column is equal to type

295:15

okay I'm going to go ahead and press

295:16

enter for this looks like we have a value I

295:18

expect a few more even filtered from

295:21

here okay not a lot now one note on this
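
Written across several lines in the expanded formula bar, the formula with all conditions might look like the sketch below. The column spellings job_title_short and job_schedule_type are assumptions based on the spoken names; title and type are the named cells C4 and K4 on the basic calculator tab.

```excel
=MEDIAN(
    IF(
        (jobs[job_country]=A2)*
        (jobs[salary_year_avg]<>0)*
        (jobs[job_title_short]=title)*
        (jobs[job_schedule_type]=type),
        jobs[salary_year_avg]
    )
)
```

When no row satisfies every condition, MEDIAN has no numbers to work with and returns a #NUM error, which is why those errors show up for some countries.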

295:24

this formula is perfectly fine for

295:27

checking the job schedule type I'm

295:29

going to make it slightly better and

295:31

actually slightly more correct if I go

295:33

over to that data validation tab I'm

295:36

going to press uh control shift U to

295:37

actually close that formula bar if you

295:39

remember

295:40

from our job schedule types yeah we

295:42

narrowed it down to this list but

295:44

actually the true list is

295:47

this so what we actually need to do is

295:51

check if a value is in here so in our

295:56

case we want to check whether the type

295:57

is in here so if we select part-time we

296:00

will also match on this job type here

296:03

where it says full-time part-time or this

296:05

one here where it says full-time

296:07

part-time temp work and we can do that

296:09

using the search function so we can find

296:14

something like part time within text of

296:18

right here and it's going to give us

296:19

back a number and then if it's not there

296:22

if I were to actually drag it down to

296:24

something like third column it's not

296:25

there it's going to give a value

296:27

error so I'm going to come back into

296:28

this and expand out the formula bar and

296:32

I'm going to change this formula right

296:34

here to basically get that condition

296:37

remember we want to use the search

296:38

function we want to find the text of the

296:42

type which is that variable that we have

296:44

for the job type and we'll be searching

296:46

the job schedule type column now

296:48

remember this is going to return back a

296:50

number of the position if it's there so

296:52

we're going to need to wrap this all in

296:54

an ISNUMBER function and then put

296:57

closing parentheses so I'm going to

297:00

autofill this all the way down again and
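
A sketch of the revised formula, where the exact-match test on the schedule type is swapped for a text search so that selecting part-time also matches combined entries such as full-time and part-time. Column spellings are assumed as before.

```excel
=MEDIAN(
    IF(
        (jobs[job_country]=A2)*
        (jobs[salary_year_avg]<>0)*
        (jobs[job_title_short]=title)*
        ISNUMBER(SEARCH(type, jobs[job_schedule_type])),
        jobs[salary_year_avg]
    )
)
```

SEARCH returns a position when the text is found and a #VALUE error when it is not, so ISNUMBER turns that into the TRUE or FALSE the multiplication needs.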

297:03

it doesn't look like any values at least

297:04

in view actually changed underneath this

297:07

formula bar for right now so I'm going to

297:08

go ahead and hide it and then for this

297:11

when we go to plot it we actually need

297:12

to remove these #NUM values from here so

297:16

in order to do this I'm going to I'll

297:18

create this new one called job country

297:21

filter and we're going to be using the

297:23

well filter function and for this we

297:26

need to include the array so everything

297:29

from here downwards pressing control

297:32

shift down to select that and then what

297:35

do we want to actually include well we

297:37

want to check to include anything in

297:40

that B column that is a number we're going to

297:43

check those values are equal to a number

297:45

so I entered in that b column then as

297:48

well all right let's go ahead and run

297:50

this and it looks like it has all of our

297:53

values I don't like the order I'd rather

297:55

it sorted this is just my preference I'd

297:57

rather the numerical values be sorted so

298:00

I'm going to wrap this all in a sort

298:02

function and this is the array we're

298:04

applying to it we want to sort it on the

298:08

second index and for this we want

298:10

to put it in we'll say descending order
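
A sketch of the combined filter and sort, assuming the countries are in column A and their medians in column B. The last row, 200 here, is a placeholder for wherever the list actually ends.

```excel
=SORT(FILTER(A2:B200, ISNUMBER(B2:B200)), 2, -1)
```

FILTER keeps only the rows whose median is a real number, dropping the #NUM errors. SORT then orders the result by its second column, and -1 means descending.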

298:14

and well Puerto Rico has some of the

298:16

highest jobs may have to move there and

298:18

okay we're going to get into applying

298:19

this now I want to make sure that we

298:20

have the maximum amount of values

298:23

present there's a lot of countries

298:24

missing that I know are available so I'm

298:26

going to just select the most basic job

298:30

possible to make sure that we have all

298:32

the jobs that can appear so we'll

298:36

just select data analyst United States

298:38

full-time okay now we can go about

298:40

selecting column d and e and then

298:43

inserting in our map now I don't want

298:46

this here so I'm actually going to grab

298:48

this map and then come over here and put

298:51

it in I'm only going to do some minor

298:53

cleanup right now I'm going to remove

298:54

the chart title and also legend but we

298:58

now have this chart map available for

299:00

countries that shows the median salary

299:02

one quick note you are going to have

299:04

this sort of warning right here if I

299:06

click on it and it says hey we plotted

299:08

74% of the locations from the data with

299:11

high confidence basically some of the

299:13

countries in there couldn't align

299:15

properly in my opinion it picked out a

299:18

lot of the major countries so I'm really

299:21

fine with that I'm fine if I didn't

299:22

identify all of them 74 is good enough

299:25

back to the final dashboard so we made

299:27

this country map right here now we need

299:28

to make these other two one thing to

299:30

call out with this which I don't think

299:31

I've called out before if we notice

299:34

whenever we select a job so in this case

299:37

I'll select data scientist it makes that

299:39

bar a darker color blue that way

299:41

your eyes go towards it and then you can

299:43

compare it to the other ones so how did

299:46

I do this well if I go to my jobs tab my

299:49

final jobs tab what I'm doing here is I

299:52

have all the median salaries which we

299:54

calculated already in ours but I added

299:56

this over here basically I have one

299:59

column without we have data scientist

300:01

selected right now so I have one column

300:04

without the value appearing and then one

300:06

column with it appearing and then what

300:09

we'll do from there is just some

300:11

basically manipulation of the graph to

300:13

make it to where in this case data

300:15

scientist appears so going back to our

300:18

worksheet of our fancy Dancy dashboard

300:20

we have so far I'm going to go to that title

300:23

sheet remember we already did all this

300:26

portion of the last section first thing

300:28

we do is well we need to do some cleanup

300:30

we need to get rid of this name error

300:32

also we are going to create those extra

300:34

columns right here for basically what

300:36

job title is selected but we need to

300:40

more importantly if I expand out the

300:43

formula bar we need to update this

300:46

median salary similar to what we do with

300:48

job type to not only take into account

300:52

the job title but also the country and

300:56

the job schedule type so I'm all for not

300:58

repeating our work I'm going to go back

301:00

over to the country tab select the

301:01

median salary and I'm going to basically

301:04

just copy all that portion that's in

301:05

there anyway I'm going to escape out of

301:07

that come back into the job title

301:10

tab select B2 and I'll go ahead and just

301:14

press uh Alt Enter insert all that in

301:18

and then now I just want to clean this

301:20

up we do want this country which we're

301:22

going to have to

301:23

fix but we don't need these middle two

301:27

right here that we already basically

301:29

have specifically with the job country

301:32

though so remember this thing's

301:34

calculating the median salary based on

301:37

the job title selected in this column here

301:40

in column A so this A2 is going to work

301:42

here previously we were doing the same

301:44

thing with country we don't need to do

301:46

country anymore we need to actually put

301:48

in a variable of country which we

301:52

haven't created yet so I'm just going to

301:53

enter country in it's going to give me

301:55

an error this name error I'm going to

301:58

come back over to the basic calculator

301:59

tab select this and then rename G4 to

302:04

Country press enter come back to the

302:07

title tab we're no longer getting that

302:09

name error looks like it's executing

302:12

just right I'm going to go ahead and

302:14

drag it all the way down and we do have

302:17

an error in my formula I have this comma

302:20

right here this is supposed to actually

302:23

be an array right this whole thing is

302:26

supposed to be um an array so now let's

302:29

try it again press enter okay 90,000

302:32

for data analyst in the United States I

302:34

know that's true and now we're filling

302:37

it in for all the rest okay so we have
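
A sketch of the title sheet version, where the job title now comes from column A and the country comes from the named cell country (G4 on the basic calculator tab). Column spellings are assumed as before.

```excel
=MEDIAN(
    IF(
        (jobs[job_title_short]=A2)*
        (jobs[job_country]=country)*
        ISNUMBER(SEARCH(type, jobs[job_schedule_type]))*
        (jobs[salary_year_avg]<>0),
        jobs[salary_year_avg]
    )
)
```

The conditions are joined with multiplication rather than commas, so IF receives one logical array as its test and the salary column as its value if true.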

302:39

what we need I'm going to close out the

302:41

formula bar and remember we want to

302:44

basically in one column if it has the

302:47

word data analyst we want to not include

302:48

it and then another one we want to only

302:50

include that one so we're going to use

302:52

an if for this so if this value which

302:57

we're going to go ahead and lock the

303:00

column is not equal to the title then

303:05

we're going to basically display those

303:06

results which I'm going to lock the

303:08

column for this otherwise I just wanted

303:11

to display an NA and not a value okay going

303:14

to go ahead and enter this and it is data

303:17

analyst so it's not going to appear

303:18

there but it will appear for all the rest of

303:19

these and so I locked those columns so I

303:21

can just drag this over and now with

303:24

this other one I want to do the opposite

303:26

basically if it's equal to title I want

303:28

it to appear and then I'll drag and drop

303:30

it all the way down so these are the
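
A sketch of the two helper columns, assuming the sorted job titles sit in column D and their medians in column E. The cell positions are assumptions; the first formula goes in the "not selected" column and the second in the "selected" column.

```excel
=IF($D2<>title, $E2, NA())
=IF($D2=title, $E2, NA())
```

The dollar signs lock the columns so the formula can be dragged sideways without shifting its references. NA() returns #N/A, which a chart leaves unplotted, so each bar is drawn by only one of the two series.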

303:33

values I want to plot so I'm going to

303:36

select D2 to D11 then holding control

303:40

also select these values right here go

303:43

in and insert recommended charts and

303:46

first one up is actually the one that I

303:48

want so we'll go ahead and insert that

303:50

so I'll take this chart and also move

303:53

that right here into the basic

303:56

calculator tab with this one once again

303:58

I don't want a chart title and I don't

304:01

want a legend the other thing are the

304:03

values the horizontal values down here

304:06

I'm going to go ahead and double click

304:07

on that scroll down here all the way to

304:10

number and we're going to do that custom

304:12

formatting that we've done previously if

304:14

it's not appearing uh feel free to type

304:17

the code in but we're going to use this

304:19

to basically format it with the

304:21

dollar sign in the front and then also

304:23

the k for the thousands place all right

304:26

the last thing is you know I don't like

304:28

to use a lot of different colors in this

304:30

so making sure the graph is selected go

304:32

to chart design and then into chart

304:34

colors right now it's set under colorful

304:37

which I think is an awful default value I'm

304:39

going to come down here and select not

304:41

this monochromatic palette 4 five sorry

304:44

but the monochromatic palette 12 and

304:47

that's because now data analyst will be

304:50

the darkest blue the other ones will be

304:52

light so that way my eyes go to that one

304:53

instead so now what we just did with the

304:55

job title we need to repeat it for job

304:58

type so a lot of copy and pasting so

305:01

we're going to move a lot faster with

305:02

this one because we've done most of this

305:04

before for this we're going to be

305:05

entering in the type sheet and I'm going

305:07

to go ahead and pull all those things in

305:09

from data validation tab now we need to

305:11

get the median salaries for that I'm

305:13

just going to come back over to the

305:14

title sheet come into here and actually

305:16

just copy this entire formula then

305:19

expanding this out with control shift U

305:21

pasting this in here now we need to just

305:23

change this up slightly so for the job

305:25

title we need to actually use the job

305:28

title whereas conversely for the job

305:31

type we no longer want to use type we

305:34

want to use what's available in A2

305:37

pressing enter we get our value for

305:39

full-time 90,000 for data analyst that's

305:42

correct and then drag it on down I'm

305:44

going to go ahead and close this formula

305:45

bar and for this I'm going to use uh

305:47

similar to what we did in that Country

305:49

sheet where we not only filter the

305:52

data to make sure we include only numbers

305:54

but also we sorted it and that's because

305:56

sometimes these values sometimes we may

305:58

not have values and we go back to this

306:00

type tab sometimes there may not be a

306:03

certain job schedule type so I'm going

306:05

to go ahead and paste this in now it is

306:08

working I know there will always be five

306:10

values so I'm going to actually change

306:12

this to B6 here and also B6 here and

306:17

press enter now I also realized I made a

306:20

mistake earlier whenever I went to the

306:22

title sheet this is only doing the sort

306:24

function and we may have a condition

306:26

where in certain countries they don't

306:28

have all these different job titles

306:31

available so we need to do a similar

306:33

thing here as well so I'm going to paste

306:35

that formula into here and then adjust

306:37

it because I know there's always 10 job

306:40

titles so it's going to go down to 11 in

306:42

this case and 11 here we go ahead and

306:47

run that there's going to be no change

306:50

the one issue though is in this case if

306:54

I go back to that basic calculator it

306:56

doesn't do it in the order that I want

306:59

so going back to that title sheet I'm

307:00

going to change that sorting value from

307:02

a negative one to a one so that way it

307:04

goes in basically ascending order and I

307:07

need to do the same thing here as

307:10

well in the type sheet where it's also

307:12

in ascending order cuz we're going to be

307:13

making the same graph all right similar

307:16

to last time if the value is

307:19

selected I want it to be highlighted so

307:21

we need to make those same columns again

307:23

so if this is not equal to the type I

307:26

want the value to appear and here it'll be NA

307:29

because right now full-time is selected

307:31

dragging it over and then adjusting it

307:33

for equal instead and then dragging it

307:36

down I do want it to appear if it's

307:38

full-time now I'm going to select

307:40

D2 to D6 and then these values in F and G

307:47

once again we're going to go to insert

307:48

recommended charts I don't like these

307:50

clustered columns I prefer a clustered

307:54

bar chart so I'm going to take this and

307:57

then put it in here make similar formatting

308:00

changes as well of removing the

308:01

title and then also the legend updating

308:04

the x-axis by going into number and

308:08

changing the format to a custom format

308:11

to using the K value instead and then

308:14

finally the actual color Itself by going

308:17

to that monochromatic color

308:20

palette 12 so bam now we have a lot of

308:24

this made so I can go through now and

308:26

select say data scientist it will

308:29

update for selecting data scientist and

308:32

then you see all these other values

308:33

update as well I can also select the

308:35

different type um part-time in this case

308:38

and then the values still remain the

308:39

same it just changes the bar that it's

308:41

selected

308:44

to all right the last major thing before

308:46

we get into formatting we're going to

308:47

make these three kpi cards one is for

308:50

the median salary the next is for the

308:53

top job platform and then finally on the

308:56

job count itself for how many counts of

308:59

jobs for all of these now one quick

309:01

thing Excel doesn't necessarily have kpi

309:03

cards like if you use something like

309:05

Power BI or Looker they provide cards for

309:08

this we're going to do some sort of

309:10

backdoor approach if you will to make

309:12

this into a kpi card basically I'm going

309:13

to insert in a text box and we're going

309:15

to put a cell equal to it you'll see

309:18

what we're going to do with it but the

309:19

main point is these values this value

309:21

itself is not as you can see it's a

309:24

rectangle it's not in a Cell per se but

309:28

it is calculated within the workbook

309:31

anyway what we're going to be doing I

309:33

don't need this down here this median

309:35

salary what we did from the last lesson

309:38

I'm gonna go ahead and delete this but

309:40

the first thing we want to calculate is that

309:42

median salary and we basically have it

309:46

already and I'm going to calculate it

309:49

right here in this column of I2 and for

309:51

this we're just going to use a simple

309:54

XLOOKUP and the value we want to look up

309:57

is based on the job title selected so

310:01

title and the lookup array is this array

310:04

right here and then the final return

310:06

array is right next to it there's a

310:09

missing value right now because Cloud

310:10

Engineers is not available in the

310:12

current criteria selected so make sure you're

310:14

selecting the full values and we're going

310:16

to go ahead and close it but we have now

310:19

the median salary so I'm going to
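
A sketch of the lookup, assuming the filtered job titles occupy D2:D11 and their medians E2:E11 on the title sheet. The ranges are assumptions; the point is to select all ten rows even when a filtered-out title leaves the last one empty.

```excel
=XLOOKUP(title, D2:D11, E2:E11)
```

Naming the cell that holds this formula (for example median_salary, since names cannot contain spaces) lets a text box point to it by typing an equals sign followed by that name in the formula bar while the box itself is selected.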

310:22

actually rename this I2 cell to median

310:27

salary and then going back into our

310:29

basic calculator tab remember I'm not

310:31

going to insert it into a cell in here

310:33

but instead we go into insert and then

310:38

illustrations and I'm just going to

310:40

insert a simple old text box I'll drag

310:42

it right there now the thing is I don't

310:44

want to type inside of here what I'm

310:46

actually going to do is select the

310:47

Box itself so you no longer have that

310:49

blinking cursor in there come up into

310:51

the formula bar up here type in equal to

310:55

median salary and Bam now if you notice

310:59

it copied the formatting that we

311:01

previously had right here as a general

311:03

number looking at right there it copied

311:05

the same formatting that we're using

311:07

here in I2 so what I'm going to do is

311:10

just go in here and change this

311:11

formatting to a currency with zero

311:13

decimal places and then once we have

311:16

this value actually updated go back to

311:18

basic calculator we can see boom looks a

311:20

lot nicer we'll adjust the formatting as

311:22

far as the size and stuff in a little

311:24

bit after we calculate all the other

311:26

ones the next one from our final

311:28

dashboard is the top job platform so

311:31

we've only calculated things associated

311:33

with the job title the job country and

311:35

the job type so we need to make a new

311:38

sheet and we'll rename it platform and

311:42

technically the column name is job via

311:45

and for this we need to get the unique

311:48

values of the job via column now for

311:54

this one we're trying to get the top job

311:56

platform so we're not necessarily doing

311:57

that based on what is the top median

312:01

salary on this I just want where are the

312:03

most jobs actually located so we're

312:06

going to be doing a count using control

312:08

shift U to expand the formula bar we've been using

312:09

this median with this if array in it

312:12

we've already built this out already

312:15

which this formula does so

312:17

we're going to use this I'm going to go

312:19

ahead and copy it by pressing control C

312:21

coming over to platform and then pasting

312:23

it in with control V okay and instead of

312:26

median we're going to use count and the

312:30

only other thing we need to update on

312:32

this since we stole it from the job country

312:34

page is we need to update the job

312:36

country to be well country and we need

312:40

to check one more condition so we need

312:42

to add to this array I'm going to press uh

312:44

Alt Enter to create a new line and we

312:47

want to check that job via is equal to

312:51

in this case A2 and we go ahead and

312:54

press enter looks like 10 were available

312:56

for via ZipRecruiter and then it

312:59

calculates all the way down now remember

313:01

our data set also has hourly data in

313:05

there as well so technically if you

313:07

wanted to which I'm going to I'm going

313:08

to remove this condition right here

313:10

that we're checking that it's not equal

313:11

to zero basically it's also going to

313:13

include if there's a job that has an

313:15

hourly salary included so I'm going to

313:18

go ahead and backspace out of that press

313:20

enter and then from there drag and drop

313:22

it down and I can see we added a few

313:24

more values because of this I'm going to close
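
A sketch of the platform count after the not-equal-to-zero condition has been removed. The column spelling job_via is an assumption, and A2 holds one platform from the unique list.

```excel
=COUNT(
    IF(
        (jobs[job_country]=country)*
        (jobs[job_title_short]=title)*
        ISNUMBER(SEARCH(type, jobs[job_schedule_type]))*
        (jobs[job_via]=A2),
        jobs[salary_year_avg]
    )
)
```

Without the zero check, matching rows with an empty yearly salary are returned as 0 and still counted, which would explain why the counts went up slightly.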

313:27

this formula bar control shift U all

313:29

right so now I need to sort these values

313:32

basically from high to low selecting all

313:35

the values using control shift down the

313:38

sort index we want to use the second

313:39

index and we want to put this one in

313:42

descending order cuz we want the highest

313:44

one up at the top and for this it looks

313:46

like Snagajob is the highest anyway uh

313:51

this is what we want this first one

313:53

actually appearing in our kpi card but

313:56

if you notice all of these have via in

313:58

front of it so what I'm going to use is

314:00

a text function of substitute which

314:03

replaces existing text with new text

314:06

and for our text in D2

314:09

the old text that I want to replace is

314:11

via with a space and the new text is

314:14

just a blank value so Snagajob is now up at
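
A sketch of these two steps, assuming the platforms and their counts are in columns A and B and the sorted result spills from D2. Row 200 is a placeholder for the real end of the list.

```excel
=SORT(A2:B200, 2, -1)
=SUBSTITUTE(D2, "via ", "")
```

The first formula sorts by the count column in descending order so the busiest platform lands in D2. The second strips the leading "via " from that top entry. SUBSTITUTE is case sensitive, so this assumes the prefix is lowercase in the data.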

314:18

the top this is what I want to be known

314:21

as we're going to rename this variable

314:23

to platform then we do the same thing on

314:26

our dashboard of inserting a text box

314:30

and for this I'm going to select it and

314:32

say that it's equal to platform all

314:35

right so Snagajob and for this one

314:38

this one is well somewhat simple but in

314:41

our data validation tab we were in the

314:45

very beginning in the last lesson we

314:47

were calculating the count and we were

314:50

calculating a generic count of all of

314:53

them so we need to once again modify

314:54

this because we want the count based on

314:57

our three conditions here so what I'm

315:00

going to do is just basically steal it

315:01

from what we did previously go into that

315:03

B2 cell in the platform sheet go ahead

315:07

and copy this all and then in here

315:09

I'm going to expand this formula out I'm

315:11

going to go ahead and replace that in B2

315:14

with this now a few modifications we can

315:16

make to this we're no longer checking

315:18

the job via column we're not trying to

315:21

check that for the count that was

315:22

specific to where we stole that from so

315:24

I'm going to delete that and also this

315:25

uh multiplication point and then this is

315:28

checking all of the things selected of

315:30

country title and type we're wanting to

315:32

check the count of a certain title so

315:36

instead of having title we'll put in a

315:39

A2 pressing enter we have a lower value

315:42

because the current filters are

315:44

lower and then we'll fill it all the way

315:46

down closing the formula bar out we now

315:49

want to get the count for whatever is

315:52

selected so I'm going to go to an empty

315:55

column over here right here and we're

315:57

going to be doing an XLOOKUP again the

316:00

lookup value is what is the title that

316:03

we're using the lookup array is we'll

316:06

use this one right here and then as

316:09

far as the return array right next to it

316:13

pressing enter boom get a value of 537

316:17

now just to be safe in case there aren't

316:20

any results like say it was zero or

316:22

something or not applicable it's going

316:23

to be basically not applicable I do want

316:26

to include if not found I'm going to

316:28

enter in no results and I'm going to do

316:31

the same thing underneath the title

316:33

sheet for where we calculated the median

316:35

salary put for no results
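
The lookup with an if-not-found fallback described above can be sketched outside of Excel as well. This is only an illustration in Python, not the workbook's formula: the function name `xlookup`, the sample titles, and every count except the 537 mentioned in the lesson are made up for the example. In Excel the same idea is the fourth argument of XLOOKUP, the optional if_not_found value.

```python
def xlookup(lookup_value, lookup_array, return_array, if_not_found="No results"):
    """Return the item in return_array that lines up with the first match.

    Text is compared without regard to case, since XLOOKUP's default
    exact match is not case-sensitive.
    """
    wanted = str(lookup_value).casefold()
    for key, value in zip(lookup_array, return_array):
        if str(key).casefold() == wanted:
            return value
    # Nothing matched: hand back the fallback instead of an error value.
    return if_not_found


# Hypothetical helper columns: job titles and the count next to each one.
titles = ["Data Analyst", "Data Engineer", "Data Scientist"]
counts = [537, 412, 298]  # 537 is from the lesson; the others are invented

found = xlookup("Data Analyst", titles, counts)
missing = xlookup("Cloud Engineer", titles, counts)
print(found)
print(missing)
```

The fallback keeps the KPI text box readable when the current filter combination has no matching row.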

316:39

so I'm going to go ahead we want to get

316:41

that count in there so we insert that

316:43

illustration again for us we're going to

316:44

insert a text box and that textbox is

316:47

going to be equal to count which I don't

316:50

think we actually named yet so I

316:52

actually need to go back to escape out

316:55

of this go back to the data validation

316:58

tab rename this count and then from

317:02

there with the text box selected I'm

317:03

going to put that equal to count now for

317:06

each one of these text boxes I need to

317:08

go through and actually

317:09

as you can see we have a text box

317:11

for the value but I actually want to use

317:14

a shape basically background to tell us

317:18

what we're actually performing or

317:20

what calculation this KPI is showing so

317:23

I'm going to come in here and to insert

317:25

illustrations for shapes we're going to

317:27

keep it actually we'll say a rectangle

317:30

this time and then we'll go ahead and

317:32

draw it now for the shape format itself

317:36

I'm going to go to this one right here

317:38

basically a blue around with white on

317:40

the front and with these shapes you can

317:44

still put in text in here so I can put

317:46

in something like median salary and I

317:49

can open up the Home tab and I can

317:52

actually customize this further so I can

317:54

make this bold I can put in the center I

317:57

actually want Center top and I'm going

317:59

to make this slightly bigger by 20 point

318:02

also I'm noticing this box is a green

318:04

outline I don't really like that I'd

318:07

rather a blue outline so we have that

318:09

now okay so how do we get that number if

318:11

you notice the number is no long it's

318:13

hidden behind here we can do a couple

318:15

different ways but I'm just going to

318:16

right-click this object and then under

318:19

shape format you can go to send

318:22

backwards specifically I want to send

318:23

all the way to the back now getting into

318:26

the actual text box itself if you notice

318:29

there's a little bit of a box around

318:31

it I don't really like that I'm also

318:33

going to expand it all the way to

318:35

the edges I'm going to format this one

318:37

as well to be centered bold and then

318:41

we're going to make the font much bigger

318:42

on this and I'm going to like I

318:45

talked about remove that shape outline

318:47

right now it has a light one I'm going

318:49

to say no outline okay so now it looks

318:52

like a kpi card copying this I'm going

318:55

to then make two more and for each of

318:58

these I'm going to send them back to the

318:59

back name appropriately to top job

319:02

platform and job count for this I'm

319:04

going to just copy this text box here

319:07

that has the median salary in it and I

319:10

just want to copy the formatting to the

319:11

other ones as well so we can

319:12

conveniently use this paintbrush this

319:14

Format Painter and I'll select this one it

319:17

disappeared I have to reselect it and

319:20

I'll also select this one if you notice

319:23

the names are cutting off so it's really

319:24

important that you extend it all the way

319:27

over same thing with the job count as

319:33

well now we're getting into the format

319:37

portion of actually just doing some

319:39

final touches on here I don't like grid

319:41

lines so under view tab I'm going to

319:43

select remove grid lines for each of

319:45

these charts I don't really like those

319:47

outlines I want it just to sort of blend

319:49

in to make it look like it's there so

319:51

for the shape outline I'm going to

319:52

change each of them to no outline up in

319:54

our data validation point I want to make

319:56

the spacing right I'm also going to make

319:58

these titles slightly bigger for the

320:01

dropdowns themselves I want them to

320:03

basically pop out so I'm going to change

320:06

this formatting I'm going to go to the

320:07

cell Styles and I really like this one

320:09

of input because it sort of calls your

320:10

eyes to what you need to go to I'm going

320:12

to make this G column slightly bigger

320:15

and then shift the type over some the

320:18

other thing I want to do is add a title

320:20

up here at the top for what this

320:21

dashboard actually does so I'm going to

320:23

select cells B1 through L1 I'm going to

320:26

do merge and center and I'm going to

320:28

change this to data science salary

320:30

calculator along with going to the cell

320:32

style we'll do heading one for right now

320:34

I want that to still be slightly bigger

320:37

okay now we're going to to start moving

320:39

stuff around but I want to get it in

320:41

like its final form that I'm going to

320:43

give to colleagues and co-workers and

320:45

I'm going to give it with the Home tab

320:48

closed and also if I go to view I can

320:52

remove headings so it removed the column

320:55

headers the A and the B and then the row

320:57

numbers as well so it looks like

320:59

everything's updating correctly one minor

321:01

thing this job count I want to make sure

321:03

after I selected full-time I saw that

321:05

the formatting of the thousands with the

321:07

comma is not there so going back into

321:09

that data validation tab I'm going to

321:11

select this go to home make it a comma

321:14

and remove all the decimal places okay

321:17

looking good all right now we need to

321:18

get this set up to give to colleagues I

321:21

don't want them to have all these other

321:23

tabs or all these other sheets so I'm

321:25

going to go through and actually just

321:26

hide the ones that aren't applicable for

321:28

them Additionally the sheet of basic

321:30

calculator doesn't really make sense

321:32

anymore cuz that was for that first

321:34

lesson I'm going to actually name this

321:35

to salary calculator now colleagues could

321:39

still potentially go in and they could

321:41

mess up these formulas and so we need to

321:44

now protect our worksheet and we only

321:47

want them to be able to manipulate these

321:49

three cells so we're going to be going

321:52

through protecting the sheet but we need

321:54

to actually recall that we have to pick

321:56

what cells that we want to lock right we

321:59

need to select all the cells and I

322:01

preemptively told you to hide the

322:03

headings you need to go back into view

322:04

and show the headings again cuz we need

322:05

to be able to select this triangle in

322:07

the upper left-hand corner to select all

322:10

the different cells and then from there

322:12

holding control unselect these three

322:15

cells and then from there we're going to

322:17

right click in there go to format cells

322:20

under protection and we want to make in

322:22

that case that they are locked or

322:24

basically we are going to be able to

322:25

lock them conversely we need to escape

322:28

out of this and now select the three

322:31

cells that we want to unlock right click

322:34

go to format cells and for these we want

322:36

to make sure that they are not checked

322:38

for this so basically unlocked whenever

322:40

we go ahead and protect the sheet so now

322:43

whenever I go into review go to protect

322:46

sheet I want to be able to select unlocked

322:49

cells once again if you want to enter a

322:51

password you can I'm going to click okay

322:53

so now I can't click anywhere else

322:57

except for where we have our data

323:00

validation so I can go through and

323:01

select things like data scientist and

323:04

Turkey now I'm just going to add that

323:05

last final touch of removing the

323:08

headings

323:09

bam we have our dashboard now I promise

323:12

last last thing before we go I'm

323:15

noticing and you're probably noticing as

323:17

well if you're going through and

323:18

manipulating these values in this case

323:20

let's go from data analyst from previous

323:22

selected data scientist this is me talking

323:24

in real time I want to show how

323:26

long it takes to load and it takes

323:28

forever to load why is it doing this

323:31

this is not good for stakeholders

323:33

they're going to get annoyed if it takes

323:35

this long I'm going to go ahead and unhide

323:37

some of our sheets specifically

323:40

that platform one now these formulas

323:43

that we're using um the array formulas

323:47

to calculate these values it's F so in

323:51

this platforms one we have like oh my

323:53

gosh in this case we have close to 200

323:56

oh no it's like slowing down even going

323:58

through this we're executing this

324:01

hundreds of times in here whereas if I

324:03

compare it to something like the title

324:06

sheet we're only running this you know n

324:10

10 times which I feel isn't that big but

324:13

if we're running this formula hundreds

324:15

of times it's going to slow down this

324:17

sheet so I have a quick fix for this and

324:21

it involves we're not going to

324:23

especially for this sheet here platform

324:24

sheets we're not going to use this um

324:27

array multiplication in order to calculate

324:29

this instead we're going to use a COUNTIFS

324:32

the first thing we're going to do is

324:35

check that the job via is equal to the

324:39

criteria one of A2 so basically job

324:42

platform is what it says it is from

324:44

there we'll check the job title short

324:46

column to make sure it matches up with

324:47

title we'll check the job country is

324:50

equal to Country and then finally we're

324:52

going to check that the job schedule

324:54

type is equal to type and then we're

324:56

going to go ahead and execute this and

324:58

then we're going to autofill it all the

324:59

way down notice that 1490 it's actually

325:01

going to go down slightly to

325:04

1426 and that's because we've now

325:07

changed this condition inside of this

325:09

COUNTIFS specifically if I go back to

325:11

that title sheet you remember whenever

325:13

we match for this we did a really

325:16

in-depth search so if any job schedule

325:19

type contained those keywords we matched to

325:21

it now we're only matching it if it

325:23

exactly matches but since this job

325:26

platform is just providing it's not

325:28

providing a numerical value it's

325:30

providing what is the Top Value I don't

325:31

think the Top Value is going to change

325:34

that much so I don't think we're being

325:36

inaccurate about this if we change this

325:39

formula anyway going back to the actual

325:42

dashboard itself now whenever I change

325:43

this from data analyst to data scientist

325:46

it is much faster so now I'm going to go ahead

325:49

and hide those sheets and we are done
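
The drop from 1490 to 1426 described in this lesson comes from swapping a "contains" test on the schedule type for an exact match. The sketch below illustrates that difference in Python rather than Excel. The column names are assumed from how the columns are spoken in the lesson, and the four rows are invented, so the numbers are only there to show the effect. It also treats each condition as a plain equality test, which is a simplification of how COUNTIFS handles text criteria.

```python
# Invented sample rows; the keys mirror the columns named in the lesson.
rows = [
    {"job_via": "LinkedIn", "job_title_short": "Data Analyst",
     "job_country": "United States", "job_schedule_type": "Full-time"},
    {"job_via": "LinkedIn", "job_title_short": "Data Analyst",
     "job_country": "United States", "job_schedule_type": "Full-time and Part-time"},
    {"job_via": "Indeed", "job_title_short": "Data Analyst",
     "job_country": "United States", "job_schedule_type": "Full-time"},
    {"job_via": "LinkedIn", "job_title_short": "Data Scientist",
     "job_country": "United States", "job_schedule_type": "Full-time"},
]


def count_array_style(rows, platform, title, country, sched):
    """Array-multiplication style: multiply the True/False tests per row
    and add them up. The schedule test is a 'contains' check."""
    return sum(
        (r["job_via"] == platform)
        * (r["job_title_short"] == title)
        * (r["job_country"] == country)
        * (sched in r["job_schedule_type"])
        for r in rows
    )


def count_ifs_style(rows, platform, title, country, sched):
    """COUNTIFS style: every condition, including schedule type,
    has to match the whole cell."""
    return sum(
        1
        for r in rows
        if r["job_via"] == platform
        and r["job_title_short"] == title
        and r["job_country"] == country
        and r["job_schedule_type"] == sched
    )


args = ("LinkedIn", "Data Analyst", "United States", "Full-time")
contains_count = count_array_style(rows, *args)
exact_count = count_ifs_style(rows, *args)
print(contains_count, exact_count)
```

The row with "Full-time and Part-time" is counted by the first function and skipped by the second, which is the same kind of gap the lesson sees between the two formulas.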

325:53

that was a heck of a lot of work so in

325:56

the next lesson we're going to be

325:57

getting into how you can actually go

325:58

through and share this dashboard

326:01

specifically for those that have a

326:02

Microsoft subscription you can use

326:03

something like Microsoft online because

326:06

it has all these features that we have

326:07

within here and host it there for others

326:10

to use additionally we're going to get

326:12

into my recommended method of sharing

326:14

any of your projects and that's via LinkedIn

326:16

now just a heads up we will be

326:19

getting into git and GitHub after

326:22

project 2 at the very end of this course

326:26

and during that portion we'll talk about

326:28

how to share not only project 2 but also

326:30

this project here but that's more

326:32

complicated and I really want to focus

326:34

on Excel so with that we're going to be

326:36

shifting in the next lesson to quickly

326:38

share it and then moving into the

326:39

advanced chapter all right with that

326:41

I'll see you in the next

326:42

[Music]

326:46

one first up congratulations on

326:49

completing your first project in Excel

326:52

and building this salary dashboard it's been

326:55

nothing short of your hard work and you

326:58

shouldn't let that hard work go

326:59

unnoticed so in this lesson we're going

327:01

to be going over different methods you

327:03

could go about actually sharing this

327:05

project to your social network and to

327:07

others to help out in the job search or

327:10

future employment now if you were just

327:12

learning these skills for fun you had no

327:14

intent of getting a new job or increasing

327:17

your pay in your current job then you

327:19

can feel free to skip this and go to the

327:21

next chapter on pivot

327:25

tables so there's a few different ways

327:27

you can go about sharing your work that

327:29

you did we're not going to dive deep into

327:31

any of these we're going to look at

327:32

these more at a high level before

327:34

jumping into one of the options first up

327:36

is a portfolio website here I have

327:38

lukebarousse.com and if I wanted to I could come

327:41

inside of here and edit it and include

327:43

my project here along with what I did

327:46

for others to see another option even if

327:48

you don't have a big following on

327:50

YouTube is you could actually go in and

327:52

record and describe what you did within

327:55

your dashboard and host it somewhere

327:57

like YouTube now for both those options

327:59

you may be like Luke how do I actually

328:01

share my Excel file that

328:03

I actually went through well that's where

328:05

we run into a little bit of issues as

328:08

yes we created this Excel file right

328:10

here but how do you actually go about

328:13

sharing it with others to see your work

328:16

well one option for this is actually

328:18

hosting your file online via something

328:21

like OneDrive which if you're paying

328:23

for a subscription of Microsoft service

328:27

you have access to OneDrive and you can

328:29

host your dashboard online all I need to

328:31

do is navigate to onedrive.live.com go

328:35

to this add new and files upload from

328:37

there select my file that I actually

328:39

want to upload online and then we can go

328:42

to it and our file is actually uploaded

328:45

here which we can actually go through

328:48

and select something like data

328:49

scientists and it will actually

328:51

calculate based on the changes we make

328:53

to it now one note the country chart

328:56

inside of Excel Online doesn't work but

328:59

I have a fix for it and mainly it's to

329:01

just remove it you go into the review

329:03

tab under protection and go to manage

329:06

protection and then you turn off sheet

329:08

protection then from there you can

329:10

delete it next all you need to do is

329:13

just take those charts and actually

329:15

extend them over so that way they take up

329:17

that extra space and then once you're

329:20

complete with that turn back on the

329:21

sheet protection and now you can go

329:24

about actually sharing this so here I'm

329:26

coming into share and you can add an

329:29

email if you want or if you just want to

329:31

share it in general with a link you can

329:33

come down here and fine-tune the control

329:36

of a link to provide in this case I'm

329:38

selecting that I'm going to share with

329:40

anyone they can edit it you could make

329:43

it view but then they can't change the

329:44

dropdowns so I recommend that you still

329:46

leave it on edit you could set an

329:48

expiration and even password and then

329:50

from there click apply and now you have

329:53

a link to your dashboard that works even

329:57

if you don't have a Microsoft account so

329:59

here I am in incognito mode within my

330:01

browser so I'm not signed in at all and

330:03

I can actually go in and access this

330:06

dashboard and go through and select

330:09

something and it updates in real time

330:11

and because I got that sheet protection

330:13

on they can't go through and change

330:14

anything except for these dropdowns

330:16

don't believe me you can check out my

330:18

project via the link below but what

330:20

happens if we want to not only maybe

330:22

share our file but also write up what we

330:26

did the work we did with this and all

330:28

the different skills that we used well

330:31

that's the case of using something like

330:34

GitHub GitHub provides a location to

330:37

store Excel files like shown here along

330:40

with giving you the ability to go

330:41

through and perform a write up detailing

330:43

all the different work that you did now

330:45

if you wanted to see this you could just

330:47

navigate over to my project where you

330:49

download all these files from on GitHub

330:52

navigate into that project

330:53

one dashboard and in here has our Excel file

330:56

and also this README which then appears

330:59

actually underneath here and details all

331:01

the different work that we did for this

331:03

now getting this project onto GitHub if

331:06

you're not familiar with GitHub is

331:09

fairly complex we're actually going to

331:11

be saving this for after project 2 and

331:14

in that case navigating back to the

331:16

project itself we'll not only be

331:18

uploading project one we'll also be

331:20

uploading project two as well so after

331:23

we finish the last chapter chapter 8 on

331:24

power pivot we'll be getting into all of

331:26

this and you'll be learning more about

331:28

git GitHub and how to manage

331:33

projects now from what I found working

331:35

in data science it's that the best way

331:38

to share your work and your project and

331:40

potentially collaborate with others is

331:42

use something like LinkedIn a social

331:44

media platform for networking in order

331:46

to share your project specifically here

331:48

I am on my profile right here and if we

331:51

scroll on down they have a section in

331:53

your profile to basically show all your

331:56

different projects that you've worked on

331:58

and contributed to and adding a project

332:00

is super simple all I got to do is click

332:02

this plus icon include a description in

332:05

my case I was trying to help out job

332:07

seekers investigate salaries for their desired

332:09

jobs put in a few skills up to five of

332:12

Microsoft Excel data analysis or Excel

332:14

dashboards now for media they do have

332:16

options to add a link or media in the

332:19

case of the media it doesn't support

332:21

Excel files and then if you try to

332:23

insert your OneDrive link I ran into

332:26

errors so I find the best way to

332:27

actually just share the link is to post

332:29

it inside of the description from there

332:32

specify when you started and stopped on

332:34

this project anybody that contributed to

332:37

it or anything that it is associated

332:39

with and then from there click save the

332:42

other option that I recommend is

332:44

actually just going in and making a post

332:47

here I just write up a short little

332:49

description of what you did with your

332:50

project and then if you want include

332:53

something like an image or even

332:55

something like a gif which shows an

332:56

overview of the project and then

332:58

probably the most important thing is

333:00

actually sharing that link to your OneDrive

333:02

online you can also post in the

333:03

comments and not include in the

333:04

description it's really up to you anyway

333:06

go through there and then post

333:08

so bam that's how you share your project

333:11

as a reminder we will be going into

333:14

greater detail into how to share both

333:16

this project and also the second project

333:19

on GitHub using git and also use things

333:21

like markdown in order to write about

333:24

your project but that'll be included

333:26

after we go through all of the different

333:27

Excel content just wanted to have a

333:29

quick way of you going through and

333:31

actually sharing what you've done so far

333:33

cuz I know you're probably excited and

333:34

proud of it all right in the next videos

333:36

we're going to be shifting gear into the

333:38

advanced chapters starting off

333:41

first with pivot tables with that I'll

333:44

see you in

333:48

there all right welcome to the advanced

333:52

chapter and because we're getting into the

333:54

advanced section you know it's time for

333:56

a new

333:57

flannel and with this Advanced chapter

334:00

we're going to be focusing on a few core

334:03

topics that I think is going to make

334:05

your life a lot easier specifically

334:07

we're going to focus on things like pivot

334:08

tables power query and also power pivot

334:12

all of these are great at automating my

334:15

Excel workflows to make it a lot easier

334:18

to do repetitive analytics that my boss

334:21

may come back to me again and again for

334:23

instead of with something like a formula

334:25

where I have to go through and make and

334:27

copy and paste that formula all over

334:29

again and rerun that whole analysis

334:32

these Advanced chapters are going to

334:33

make your life a lot easier anyway in

334:35

this chapter we're going to be focused

334:36

on pivot tables this lesson specifically

334:39

will be getting an intro into pivot

334:41

tables how to make them how to

334:43

manipulate them how to even read them in

334:45

the next lesson we'll be going into

334:47

advanced pivot tables looking at things

334:50

like grouping and even aggregating such

334:52

as getting percentages of grand totals

334:55

and whatnot and then the final lesson in

334:57

this chapter is on pivot charts which

334:59

allows us to basically take what we have

335:01

in our pivot tables and convert it into

335:04

a usable chart hence the name pivot

335:07

chart all right so let's actually get

335:09

into it and understanding why these

335:12

pivot tables are so

335:16

important so in the basics chapter we

335:19

made this table right here which uses

335:23

hardcoded values for the different job

335:25

titles along with the different months

335:28

and then from there uses formulas

335:30

specifically SUMPRODUCT along with

335:32

some array calculations in order to

335:35

calculate how many job counts per month

335:38

this is cool and all but what happens if

335:41

we wanted to add another job title so

335:44

say we have like some like business

335:45

analyst or we have software developer

335:48

we'd have to actually manipulate and

335:49

upgrade all these different formulas

335:51

that we have here well here's that same

335:53

table but in a pivot table and by its

335:58

name that's what they're great at

335:59

they're great at pivoting and thus

336:01

aggregating data based on certain values

336:04

and whatnot so what if we want to add

336:07

more job titles to this well I can just come

336:09

in here similar to how we manipulate a

336:11

table select this filter dropdown and

336:13

then go from there and select things

336:15

like oh I want to include something like

336:16

a business analyst and then the data

336:19

automatically updates for this no

336:21

readjusting formulas makes it super

336:23

simple I can even take this table a step

336:25

further and if I wanted to I can

336:28

actually filter by the job country in

336:30

this case I'm filtering by the United

336:32

States and we now have these values

336:35

makes it super simple anyway we're

336:36

getting ahead of ourselves we actually

336:37

need to get into creating our first

336:39

pivot

336:42

table all right so for the advanced

336:44

chapters it's going to be a little bit

336:46

different for what files you're going to

336:47

use for this the final results of this

336:50

lesson will be in the lesson title of

336:53

pivot table intro but what I want you to

336:55

do whenever you're going through or

336:57

following me along in this lesson is

336:59

actually revert back to the previous

337:01

file of the last lesson in this case or

337:04

the first lesson so we don't have one so

337:06

I have this one called zero of just

337:07

pivot tables that's the one you want to

337:09

start with so in this case pivot tables

337:12

itself just has the data tab of the data

337:14

we want to work with and this sheet of

337:16

the table that we've been familiar with

337:17

in Basics chapter which by the end of

337:19

this we're going to make a pivot table

337:21

out of and when I say out of I mean actually

337:23

of the core data itself anyway for the

337:27

actual pivot table intro this will have

337:29

also those similar tabs but then also

337:32

the lesson itself will have all the

337:34

different work that we've actually done

337:36

to complete what we need to do so feel

337:38

free to just have both of these up

337:41

during a lesson so that way you can

337:42

consult back and forth in case you get

337:44

lost all right so let's get into our

337:46

first pivot table we're going to be

337:47

using the data that we've previously been

337:48

using of all the salary data for those

337:51

job titles anyway if I go into the

337:53

insert tab up here in the top left hand

337:56

corner I have pivot tables but I also

337:57

have recommended pivot tables if I don't

338:00

have an analysis in mind I could come

338:02

into recommended pivot tables a Pan's

338:04

going to appear on the right hand side

338:06

and notice here that it actually

338:08

selected the data range I know that's

338:11

the data range and it goes through and

338:13

provides some recommended different

338:15

pivot tables that you could put into

338:18

here whether you put it into a new sheet

338:20

or an existing sheet but I know what

338:23

analysis I want to do specifically I

338:25

want to do a count of the different job

338:29

titles so data engineer I want to find

338:31

the counts of this senior data analyst

338:33

and so on right now it's not providing

338:35

any of that I don't typically find that

338:36

any time with recommended pivot tables

338:38

that it provides me what I want so I

338:39

don't find myself using that often

338:41

instead I go directly into pivot tables

338:44

right here and then we have three

338:47

options but we're really going to focus

338:48

for this lesson and this chapter is from

338:51

table or range I'm selected inside of A4

338:55

right now but it automatically knows

338:58

that this is the data range all the way

339:00

down to the bottom the other thing it

339:01

says is choose where you want the pivot

339:03

table to be placed you can either do a new

339:04

worksheet or you can do inside the

339:07

existing worksheet but you have to

339:08

specify a location we don't want that I

339:11

typically like it in a new worksheet to

339:13

keep my analysis in one standard

339:15

location the last thing it asked is

339:17

whether you want to analyze multiple

339:19

tables specifically add this to the data

339:22

model we're going to be going into Data

339:24

models very heavily in the power pivot

339:28

chapter or chapter eight or last chapter

339:30

this is a super powerful feature when

339:32

you have multiple tables you need to

339:33

combine it we're not doing it in this

339:35

lesson or in this chapter so we're going

339:37

to leave it unchecked so now I'm in this

339:39

new sheet that I'm going to rename to

339:42

job count and I'm also going to move it

339:45

over here to the end anyway this pivot

339:48

table this pivot table 2 that it's calling

339:50

it there's nothing in it right now

339:52

and you notice there's a few things that

339:54

popped up first is the pivot table

339:55

analyze tab which is available with this

339:58

and also the design tab we'll be going

340:01

into these in some upcoming examples

340:03

that we're going to get into we're

340:05

however going to be focusing on for this

340:07

example on the job count I'm

340:08

going to close this out on this pivot

340:10

table fields pane right here now the

340:14

layout of this you may see it's somewhat

340:16

different is we have the columns over

340:19

here on the left so if you remember the

340:21

job title short column job title column

340:22

job location and then these fields on

340:26

the right hand side are things for like

340:29

filters row columns or values so I can

340:31

take the job title short column put into

340:33

something like the rows and get

340:35

basically all the values in the rows now

340:37

your layout may be a little bit

340:38

different if you come up and select the

340:40

tools icon right here you may be under

340:43

this Field section and area section

340:46

stacked which has the fields down here on

340:48

the bottom I personally don't really

340:51

like this because look how short my

340:53

column titles are so I like having them

340:57

like this instead anyway I think we

340:59

understand this columns area right here

341:01

but I don't think we understand these

341:02

filters rows columns and values so let's

341:05

explore this by calculating the counts

341:08

of these different job titles now

341:10

anytime I add something to the rows or

341:12

any of these columns I can either remove

341:14

it by grabbing it and pulling it off

341:16

notice they have the x mark on it or

341:18

similarly I can also just come in here

341:21

and uncheck the check mark box that's

341:23

more applicable especially if you're having

341:26

it in multiple different panes and want

341:28

to remove it completely makes it simple

341:30

besides rows we also have columns and so

341:32

instead of the job titles being in rows

341:35

they're in the different columns I don't

341:37

really like this too much I typically

341:38

find myself using rows so we're trying

341:40

to calculate what is the count of these

341:43

job title shorts so I'm just going to

341:44

take that job title short again and put

341:46

it into the values and it automatically

341:50

Aggregates this by counts of that but

341:54

what happens if I don't want to do that

341:55

count aggregation well one way is to

341:58

come back into that values right here

342:00

and I'm going to just click it not right

342:02

click it just normal click it and then

342:04

go into value field settings and this

342:07

pop-up is going to come up first up is

342:09

the name of the column itself I actually

342:11

don't like this for a name I'm just

342:12

going to rename this to job count under

342:15

here under the summarized values by tab

342:17

you can select a lot of different

342:20

aggregation methods we're going to stay

342:22

with count you can also change how you

342:25

show value as basically if we wanted to

342:27

do a percentage of some total or not

342:29

we're going to be jumping into that in the

342:30

advanced lesson so stand by for that the

342:32

last thing to note with this is the

342:34

number format so I can come in here and

342:36

actually select in our case we have

342:39

thousand values so I like to use a thousands

342:41

separator along with zero decimal places

342:44

and then clicking okay to apply this all

342:47

it updates the formatting and the name
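As a rough sketch outside of Excel, the count aggregation that the values area performs here can be mimicked in plain Python. The column name `job_title_short` and the sample rows are made up for illustration and are not the course data set.

```python
from collections import Counter

# Toy rows standing in for the jobs data (made-up values, assumed column name).
rows = [
    {"job_title_short": "Data Analyst"},
    {"job_title_short": "Data Engineer"},
    {"job_title_short": "Data Analyst"},
    {"job_title_short": "Cloud Engineer"},
    {"job_title_short": "Data Analyst"},
]

# Rows area = job_title_short, values area = count of job_title_short.
job_count = Counter(row["job_title_short"] for row in rows)

# Mimic the number format: thousands separator with zero decimal places.
for title, count in sorted(job_count.items()):
    print(f"{title}: {count:,.0f}")
```

Each distinct job title becomes one row label and the counter value plays the role of the job count column.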

342:49

so we've gone over rows columns and

342:51

values what happens if we want to then

342:53

filter let's say for only United States

342:56

jobs well I could drag something like

342:59

the job country column into filters and

343:02

right now it's selecting all you have

343:04

you see this pane come up right here and

343:06

from here I can actually go

343:08

through and select something like the

343:10

United States click okay and now the

343:13

values as you can see they reduced and

343:15

are only United States values other types

343:17

of filtering I can do I can filter the

343:19

row itself so if I wanted to I could

343:22

select the different job titles that I

343:24

want to appear in this and click apply I

343:27

could also do something where let's say

343:29

I wanted only job title so we're going

343:30

to do a label filter and jobs that

343:33

contain the word data so I could just

343:36

type in here

343:37

data and whenever I filter it I get all

343:40

the different jobs that contain data
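The two filters used so far can be sketched in plain Python as well. This is only an illustration under assumptions: the column names `job_title_short` and `job_country` and the rows are made up, and the text match is done case-insensitively here by choice.

```python
# Toy rows (made-up values, assumed column names).
rows = [
    {"job_title_short": "Data Analyst", "job_country": "United States"},
    {"job_title_short": "Data Engineer", "job_country": "Germany"},
    {"job_title_short": "Cloud Engineer", "job_country": "United States"},
    {"job_title_short": "Senior Data Analyst", "job_country": "United States"},
]

# Filters area: keep only United States postings.
us_rows = [r for r in rows if r["job_country"] == "United States"]

# Label filter "contains data": keep row labels whose text includes the word.
data_titles = sorted(
    {r["job_title_short"] for r in us_rows if "data" in r["job_title_short"].lower()}
)

print(data_titles)
```

The first step narrows the underlying rows, while the second only decides which row labels stay visible.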

343:41

similarly I could also filter by this

343:44

job count here and that's by the values

343:47

filter so I'm going to remove this label

343:49

filters to start with and we can go back

343:52

in here in the values filter and we

343:54

could do something like hey we want to

343:55

get jobs that are only greater than

343:59

let's see here Cloud Engineers 33 I

344:01

don't want to see that anymore I get to

344:03

greater than 100 and it filters down but

344:06

we're not going to use any filters right

344:07

now so I'm going to one clear this

344:10

filter for the table and then also

344:13

remove this filter from filtering for

344:16

the United

344:19

States so let's get into taking this

344:22

analysis a step further and we're going

344:23

to want to now analyze the average

344:27

salary of these different job titles

344:29

while we're going through this we're

344:31

also going to be exploring the pivot

344:32

table analyze tab so a quick tour of

344:35

this tab first up over here on the left

344:37

is Pivot tables if I wanted to I could

344:40

go through and rename this I'd probably

344:42

name this typically something similar to

344:44

what my sheet name is in this case I

344:47

named it job count additionally inside

344:49

of here we have options which allows us

344:52

to do a lot of detailed control of how

344:55

we're building our pivot tables it's a

344:56

very Advanced feature I don't find

344:58

myself going into it quite often unless

345:00

I need to fine-tune the functionality of

345:02

it active field so that tells us

345:04

basically what's the active field

345:06

grouping is something we're going to go

345:07

into in the next lesson we actually go

345:09

and perform grouping of different job

345:11

titles slicers and timelines we're going

345:13

to be going into the last lesson on

345:15

pivot charts in order to basically use

345:17

these slicers and timelines to filter

345:19

data section is used to control our data

345:21

so I can click something like refresh or

345:24

refresh all it's going to refresh the

345:26

data that we have so in this case

345:28

remember business analyst is around

345:30

1,001 so if I go back to our data itself

345:34

and I find this entry on business

345:36

analyst and let's say that

345:37

that's not correct and I delete that out

345:39

of there whenever I come back to this

345:42

table itself it still says

345:46

1,001 what I have to do is well we've

345:49

updated the data so I have to well

345:51

refresh it now that I refreshed it it's

345:54

down to 1,000 I actually don't want to

345:56

remove that entry so I'm going to just

345:58

press control Z and bring that right back

346:02

and then also click refresh to make sure

346:03

it's up to date if I want to change the

346:05

data source or maybe the range I could

346:07

go into something like this of change

346:08

data source actions allow us to clear

346:12

select and even move a pivot table for

346:14

calculations they have things like

346:16

calculated fields and items but we're

346:18

going to get into measures and I feel

346:20

they're way more powerful so we're not

346:21

going to cover this much the last thing

346:23

to cover with this is over here on the

346:25

right hand side is the show sometimes

346:28

whenever you're navigating you'll click

346:30

into your pivot table and that pivot

346:31

table Fields pane won't pop up you can

346:33

also toggle it on and off by clicking this

346:36

field list and if you didn't want

346:38

something like row labels at the top you

346:40

could just remove the field headers as

346:42

well so getting into that actual

346:43

analysis we want to analyze the salary

346:46

year average what is the average value

346:50

now I can't see all the different values

346:51

selected in here so I'm going to

346:52

actually go ahead and close

346:54

this pane up here to have a bigger view

346:55

anyway what it did was it did a sum of

346:58

the salary year average we don't really

347:02

want that we want to go to average and

347:05

I'll change this column name to

347:07

average yearly salary now if you've been

347:10

following along since the basic chapter

347:12

you probably know that I prefer

347:14

performing a median for this salary data

347:17

over an average but if you actually go

347:20

through this there's no median value for

347:23

this that doesn't mean you can't do

347:25

median in pivot tables you actually can

347:27

you can actually do even more advanced

347:28

stuff which we're going to get to in

347:31

chapter 8 on power pivot but for now

347:33

we're just going to stick to only

347:35

performing average for this I'm going to

347:37

click okay so the formatting on this is

347:39

all jacked up and we could go into that

347:42

field settings and adjust that or I can

347:44

actually go in as long as I have all the

347:46

values selected here I can select hey I

347:49

want to convert this to a currency and

347:52

that I don't want any decimal places and

347:54

it's going to format all the values and

347:57

I feel this is a little bit easier

347:58

because now actually if you go back and

348:01

in exploring the value field settings

348:03

inside of number format it actually

348:05

applied this custom formatting for me so

348:08

it knows to apply that since I applied

348:10

it to all the values that were visible
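To make the difference between the default sum and the average concrete, here is a small plain Python sketch. The column names and salary figures are made up. The median is included only to show the statistic mentioned above; it is not something the value field settings dialog offers.

```python
from collections import defaultdict
from statistics import mean, median

# Toy rows (made-up salaries, assumed column names).
rows = [
    {"job_title_short": "Data Analyst", "salary_year_avg": 80000.0},
    {"job_title_short": "Data Analyst", "salary_year_avg": 100000.0},
    {"job_title_short": "Data Analyst", "salary_year_avg": 150000.0},
    {"job_title_short": "Data Engineer", "salary_year_avg": 120000.0},
    {"job_title_short": "Data Engineer", "salary_year_avg": None},  # no yearly salary
]

# Collect the salaries per job title, leaving out blank values.
salaries = defaultdict(list)
for r in rows:
    if r["salary_year_avg"] is not None:
        salaries[r["job_title_short"]].append(r["salary_year_avg"])

summary = {
    title: {"sum": sum(v), "average": mean(v), "median": median(v)}
    for title, v in salaries.items()
}

# Currency format with zero decimal places.
for title, s in summary.items():
    print(f"{title}: average ${s['average']:,.0f} (median ${s['median']:,.0f})")
```

With a high outlier in the data analyst rows, the average lands above the median, which is the usual reason to prefer a median for salary data.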

348:12

now since this is so easy I could also

348:14

do something like get the average of the

348:16

hourly salary once again it's doing the

348:19

sum of that and I don't want that I want

348:22

the average itself and I can change that

348:25

column Name by just going in here and

348:27

typing in average hourly salary

348:31

inspecting the value field setting it

348:33

also updates inside of here and I'm

348:35

going to go ahead and adjust the

348:36

formatting as well changes to a currency

348:38

with two decimal

348:42

places so let's get into actually

348:44

cleaning up how this table looks and we

348:46

can go and do this by going into the

348:49

design tab now I'm going to start over

348:51

here on the right in pivot table Styles

348:52

and we can actually change what it may

348:55

look like in this case I sort of like

348:57

this one right here the simplistic look

349:00

I can also change things like column

349:02

headers which I like the formatting on

349:04

it or whether I want banded rows or

349:07

banded columns in my case I kind of like

349:09

the banded rows we'll go with that last

349:11

portion is around the layout if you

349:13

notice down here we have this grand

349:15

total over here this is a grand total

349:18

based on well the column values it's

349:21

adding up all the values in the column

349:22

so this is on for the column so if I

349:24

wanted to turn it off for rows and

349:26

columns I could come up here and

349:27

actually do that I kind of like this so

349:29

we're going to leave it on I could also

349:30

turn it on for the rows and columns but

349:34

in this case because we're doing

349:35

different aggregation method so a count

349:37

here and an average here it's not

349:40

necessarily going to do anything over

349:42

here for the row grand total whereas for

349:45

something like the columns grand total

349:48

that knows that hey for a job count I

349:50

probably need the total count for the

349:52

average I probably need an average and

349:54

that's what it does for both of these

349:56

there's some additional ones up here on

349:57

adjusting the report layout adjusting

349:59

for blank rows and then also subtotals

350:01

we'll be exploring that as we go along

350:02

as we build out more complex pivot

350:04

tables
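The grand total row behaves differently per column: a total count for the job count and an overall average for the salary. A minimal plain Python sketch with made-up numbers shows why that overall average is taken across all rows and is not simply the average of the per-title averages.

```python
from statistics import mean

# Toy (job title, yearly salary) pairs; the values are made up.
rows = [
    ("Data Analyst", 90000.0),
    ("Data Analyst", 110000.0),
    ("Data Analyst", 100000.0),
    ("Data Engineer", 140000.0),
]

by_title = {}
for title, salary in rows:
    by_title.setdefault(title, []).append(salary)

job_count = {title: len(v) for title, v in by_title.items()}
avg_salary = {title: mean(v) for title, v in by_title.items()}

# Grand total row: counts add up, the average uses every underlying row.
grand_total_count = sum(job_count.values())
grand_total_avg = mean(salary for _, salary in rows)

# For comparison only: averaging the two per-title averages gives another number.
average_of_averages = mean(avg_salary.values())

print(grand_total_count, grand_total_avg, average_of_averages)
```

The two numbers differ because the job titles do not have the same number of rows.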

350:07

so let's now get into that final

350:09

analysis and we're going to be creating

350:11

basically this pivot table that we did

350:13

previously with formulas and functions

350:17

so what we'll need to do or think of

350:18

right we're going to need the job title

350:20

short in the rows and we're going to

350:23

need the month the job posted months in

350:27

the columns and then we'll need to

350:28

aggregate this by count for the values

350:31

now I can navigate back to the data Tab

350:33

and once again go to insert pivot table

350:36

if you notice here it says from table or

350:38

range so that's the really good thing if

350:41

we actually convert this to a table

350:43

we'll now be able to once we do this

350:45

press okay and rename this to something

350:48

like jobs now we can really be anywhere

350:51

in this workbook in this case I created

350:53

a new sheet I go hey insert from table

350:57

or range specifically I want to do a

350:58

table of jobs and we want to do this

351:01

existing worksheet in A1 and all the

351:05

values from that jobs table are now here

351:08

so we know we need the job title short

351:10

along the rows but then we need the job

351:13

posted month across the top which right

351:17

now we have a date we could put the date

351:19

into the columns but we get this error

351:22

Message hey you cannot place a field

351:23

that has more than well 16,000 different

351:26

values for it so we're not going to do

351:28

that also before we forget I'm going to

351:30

rename the sheet to monthly count anyway

351:32

we need a monthly value here so what

351:36

we're going to have to do is the good thing about

351:38

the table itself is now that we've

351:41

created this as a table I know next to

351:43

this job posted date column I want to

351:45

insert in a column called job posted

351:48

month and for this we'll just use that

351:51

text function that we already know using

351:53

the value of job posted date and then

351:56

for the format we know we want three

351:59

lowercase M to get the month itself it's

352:02

going to fill all the way down okay so

352:03

now we have job posted month going back

352:05

to our pivot table itself remember we're

352:08

not going to see job posted month in

352:11

here until we actually go back into

352:14

pivot table analyze and click

352:17

refresh now job posted month is inside

352:21

of here and conveniently it's also in

352:23

the correct order now this thing is

352:25

completely blank right now we need to

352:27

actually add what values we want so I'm

352:28

going to drag job title short into

352:31

values and it's going to do a count

352:33

notice here we do have column values

352:37

which go up and down and then the row

352:38

values itself so we can see what the

352:41

count of business analyst is around 101

352:44

I'm not really a fan of these things

352:46

that say row and column labels

352:48

so I'm going to toggle off field headers

352:49

to make this look a little bit better

352:51

and I'm also going to change the name of

352:53

this to monthly job count so bam this is

352:56

looking good and we compare it to our

352:58

basically non-pivot table just to make sure

353:00

that our values are correct we can see

353:01

we have 982 for data analyst come over

353:04

to data analyst we have 982 all right the

353:07

last thing we want to do is actually

353:09

filter this down and better sort our

353:11

values specifically I'm curious about

353:13

roles in the United States so I'm going

353:15

to drag that job country over here and

353:18

select United States from here to apply

353:20

to it additionally I care about the most

353:22

important jobs at the top and the least

353:24

important at the bottom mainly by this

353:26

grand total right here and so what I can

353:28

do is I can sort it by the grand total

353:30

but if you notice I removed that

353:32

filter button right here whenever I

353:34

actually remove the field headers so I

353:36

can also go in instead and right click

353:40

the value inside of grand total

353:42

and I can say sort from in our case

353:45

largest to smallest so I feel like that

353:48

makes it a lot more convenient to also

353:50

sort additionally I'm noticing the

353:51

formatting isn't correct for this I'm

353:54

going to put in that comma separator and

353:55

then remove the two decimal places

353:58

similarly not only did we sort by the

353:59

grand total let's say I only wanted

354:01

maybe the top six of these right here I

354:04

could rightclick any of these job titles

354:06

right here and then go into filter in

354:09

this case I'm going to go top 10 instead

354:12

I'm going to select top six press okay
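The whole monthly count build can be sketched in plain Python: derive a month label from the date (the same idea as the text function with the three lowercase m format), count by job title and month, filter to the United States, sort by the grand total and keep the top few. The rows and tuple layout are made up, and `top_n` is set to 2 only because the toy data is tiny; the lesson uses 6.

```python
from collections import Counter, defaultdict
from datetime import date

# Toy (job title, country, posted date) rows; all values are made up.
rows = [
    ("Data Analyst", "United States", date(2023, 1, 5)),
    ("Data Analyst", "United States", date(2023, 2, 9)),
    ("Data Analyst", "United States", date(2023, 2, 20)),
    ("Data Engineer", "United States", date(2023, 1, 17)),
    ("Data Engineer", "United States", date(2023, 2, 2)),
    ("Data Scientist", "United States", date(2023, 1, 30)),
    ("Data Analyst", "Germany", date(2023, 1, 11)),
]

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Rows = job title, columns = month label, values = count of postings.
pivot = defaultdict(Counter)
for title, country, posted in rows:
    if country != "United States":  # the filters area
        continue
    month = MONTHS[posted.month - 1]  # helper column, like a "mmm" text format
    pivot[title][month] += 1

# Grand total per row, then sort largest to smallest and keep the top N.
grand_total = {title: sum(c.values()) for title, c in pivot.items()}
top_n = 2
ranked = sorted(grand_total, key=grand_total.get, reverse=True)[:top_n]

for title in ranked:
    cells = [pivot[title][m] for m in MONTHS[:2]]
    print(title, cells, grand_total[title])
```

Keeping the month names in a fixed list is what preserves calendar order instead of alphabetical order.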

354:15

now that we have this all sorted I can

354:17

once again go into that design tab

354:19

change the grand totals we're going to

354:20

turn it on for columns only and Bam now

354:25

we have basically the same pivot table

354:28

that we had before with our values or

354:30

using formulas but instead now with

354:32

pivot tables and this is a lot more

354:35

customizable all right all right it's

354:36

your turn now to get your hands dirty

354:38

with some practice problems and

354:39

exploring how to make some different

354:41

pivot tables in the next lesson we're

354:43

going to go deeper with pivot tables

354:45

looking at things like grouping

354:47

hierarchy and how we can show different

354:49

values as with that I'll see you in the

354:51

next

354:55

one so let's get into some Advanced

354:58

pivot table features and for this lesson

355:01

and actually for everything in advanced

355:03

chapter we're going to be sticking with

355:05

that salary data set of over 30,000 rows

355:09

in order to actually analyze for this so

355:11

I'm not going to be calling it out

355:13

really any further into other lessons or

355:16

chapters the first thing we're going to

355:17

focus on is hierarchy which allows us to

355:20

look at things like we want to aggregate

355:23

not only the job title itself but also

355:25

by the country so what job titles are

355:27

within a country and then look at

355:29

specific values there for say like the

355:31

salary next we're going to move into

355:33

grouping focusing first on automatic

355:35

grouping basically using that job posted

355:37

date column to automatically aggregate

355:40

by year month and whatnot and from there

355:43

we'll then shift into some manual

355:45

grouping we'll be able to create groups

355:48

of different job titles and basically

355:50

break out whether we want to look at

355:52

maybe senior roles such as senior data

355:54

analyst senior data engineers and

355:55

compare them to just normal data nerd

355:58

roles such as data analyst or data

355:59

Engineers with this we're also going to

356:01

dive deep into understanding a deeper

356:04

method to analyze maybe percentages of

356:07

totals or percentages of grand totals

356:09

when analyzing these type of groups for

356:12

this you can continue working with that

356:14

workbook you were working with on the

356:16

last lesson if you did everything we

356:17

did there or you can just open the pivot

356:21

table intro for this lesson once again

356:25

as a reminder the solution is going to

356:27

be in pivot table Advanced we don't want

356:30

to open that just yet because it could

356:31

mess up what we're doing here if I could

356:34

it is so we have four different sheets

356:35

that we created with this I only

356:37

really care about the data tab right now

356:38

so I'm actually going to select all

356:40

these other ones by holding control and

356:42

then right clicking it to hide

356:47

them so let's actually look at what a

356:50

hierarchy actually creates I'm going to

356:52

go in and insert a pivot table from

356:54

table or range remember we're using that

356:56

table of jobs you should have named the

356:59

table that in order for this to work and

357:00

we're going to insert it in a new

357:01

worksheet I'm going to move this over

357:03

and I'm also going to call

357:05

this sheet hierarchy so for this we want

357:08

to look at the salaries for job titles

357:11

in a certain country so we're going to

357:14

start by dragging that job country over

357:16

to rows and right now there's no

357:19

hierarchy but if I drag job title short

357:23

into the rows as well when we close this

357:25

tab up here we can see that now we have

357:27

two values in here and how we have

357:30

values underneath here we've now created

357:33

a hierarchy so Albania is basically the

357:36

parent or the top of this and then we

357:39

have data analyst data scientist senior

357:41

data scientist notice there's only three

357:42

values here and that's because

357:44

Albania is sort of a smaller country they

357:46

only have three types of jobs there at

357:48

least in the data set now we want to

357:49

look at salary for this so I'm going to

357:51

drag the salary year average into the

357:53

values it's going to do a sum once again

357:56

going into value field settings I'm

357:58

going to change this to average rename

358:00

the title to salary year average and

358:02

then changing the number format to

358:04

currency with zero decimal places

358:07

pressing okay for all this bam now I'm

358:09

also curious by this how many jobs we

358:13

actually have with a salary value this

358:16

just sort of an add-on so I'm going to

358:17

drag that salary year average over going

358:20

into the value field settings I'm going

358:21

to do a count of this and we'll call

358:25

this job count click okay so now we get

358:28

more of a relative idea of how many jobs

358:30

there are so in Albania we have well only five

358:32

job postings so now I want to get into

358:35

actually seeing what countries have the

358:38

highest pay now as a refresher you can

358:41

come in here and select the dropdown and

358:43

we could either select how we're going

358:45

to filter the row labels or filter the

358:48

value labels but remember we want to

358:51

sort them and right now there is only the

358:52

option to sort A to Z or Z to A for

358:55

those row labels instead I can just

358:58

click make sure I'm clicking the salary

359:00

year average because that's what I care

359:01

about I can rightclick it and from there

359:03

go to sort in this case sort largest

359:06

to smallest and what it did is it

359:08

sorted the values well within each of

359:11

these it still kept the

359:13

countries in alphabetic order instead

359:15

what I can do is Select this cell for

359:17

the countries because I want to sort the

359:19

countries highest to lowest and then I

359:22

can sort largest to smallest as well now

359:25

this is pretty neat because now we can

359:26

see things like Belarus Russia Bahamas I

359:29

got to go down there have some of the

359:31

highest salaries by country and then

359:32

what those are based on the different

359:34

job titles there now sometimes I

359:36

find reading this somewhat difficult in

359:39

this manner that it's laid out here I'm

359:40

going to show you how you can actually

359:41

change this so going back into the

359:43

design tab remember we had this report

359:46

layout that we sort of breezed over last

359:49

right now it's in this show in compact

359:51

form we can actually change this to

359:53

something like show in outline form and

359:55

it will basically shift this over and

359:58

have this hierarchy basically in two

360:00

separate columns it also makes it nice

360:02

that you can actually a little bit

360:03

easier to sort with another method is

360:06

show in tabular form so now it basically

360:09

crunches it up and I actually like this

360:11

one even better and it's still breaking

360:13

out the job country and job title short

360:15

into two different columns but now it's

360:17

actually aggregated to fewer lines so I

360:19

can actually see more data on here now

360:21

this is definitely a form that I'd like

360:23

if I want to hand it over to a boss and even

360:24

if I wanted to convert this even further

360:26

to what is this repeat all item labels

360:29

so now I could if I wanted to actually

360:31

copy and paste this into its own table

360:34

and analyze further at least now I have

360:36

like Bahamas with the software data

360:38

engineer not software data engineer I

360:40

mean software engineer or senior data

360:43

engineer anyway you may have noticed

360:44

there are some blank values in here and

360:47

that's because it has an Associated

360:49

hourly salary but not yearly what I'd

360:53

need to do is actually apply a value

360:55

filter because it's a value so I come in

360:58

here and click the drop down go to value

361:00

filters and then maybe put something

361:02

like greater than we'll put zero and now

361:06

those values will

361:10

disappear next analysis we're going to

361:12

do is a count by the job month but we're

361:16

not going to use this job posted

361:18

month column that we created in the last

361:20

lesson instead we're going to use

361:22

automatic grouping for this so we'll go

361:25

ahead insert in a pivot table we'll

361:28

insert into a new sheet and we'll call

361:31

this group automatic I'll go ahead and

361:34

move that to the very end okay so what

361:36

I'm going to do is I'm going to take the

361:37

job posted date remember it's a bunch of

361:40

dates and I'm going to throw it into the

361:42

rows and this is going to get into some

361:44

aggregation it's going to take a little

361:45

bit to load my computer's not even

361:47

loaded yet but it's about 15 seconds

361:49

later and it is now available if you

361:52

notice now we have this hierarchy of

361:56

this grouping and I can now dive into in

362:00

this case January and then one Jan here

362:04

and then diving in further we can dive

362:06

into specific times of job postings

362:09

going to go ahead and close this up if

362:11

we actually investigate over here inside

362:14

of here we can see that after I dragged

362:16

that job posted date over it basically

362:18

created a month days and then the date

362:22

itself which is actually a date time but

362:25

anyway three different values its own

362:28

hierarchy with this automatic grouping

362:31

and so now I can go in and do something

362:34

like drag the job title short into here

362:36

to get the job count I'm going to change

362:38

this

362:39

to job count also go in and actually

362:43

adjust the formatting but now whenever

362:47

we actually go into each one of these

362:48

hierarchies and look in we can see how

362:50

many job postings we're having on a daily

362:53

basis and how many were happening at

362:56

a certain date time so now let's say I

362:58

wanted to dive deeper to understanding

363:01

maybe why July had such a high number

363:04

compared to all the other months one I

363:07

could double click it or I can just

363:08

rightclick it and go to show details

363:12

this is going to show well the details

363:15

and if we actually go over to the job

363:17

posted date column it's going to have

363:20

all the values for July inside of here
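The automatic grouping and the show details step can be approximated in plain Python. This is a sketch under assumptions: the date times are made up, and the grouping is reduced to month and day levels.

```python
from collections import Counter
from datetime import datetime

# Toy job posted date values (made up); each one is a full date time.
postings = [
    datetime(2023, 1, 1, 9, 0),
    datetime(2023, 1, 1, 14, 30),
    datetime(2023, 1, 2, 8, 15),
    datetime(2023, 7, 4, 10, 0),
    datetime(2023, 7, 4, 11, 45),
    datetime(2023, 7, 19, 16, 20),
]

# Grouping levels: month first, then day within the month.
by_month = Counter(p.month for p in postings)
by_day = Counter((p.month, p.day) for p in postings)

# Show details for July: pull the underlying rows behind that one cell.
july_rows = [p for p in postings if p.month == 7]

print(by_month[7], by_day[(7, 4)], len(july_rows))
```

The number of detail rows matches the count shown for July, which is the double check described here.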

363:24

so this is a pretty unique way to get

363:27

into diving deep and showing the details

363:30

of what is the data being used to

363:33

perform these aggregations and also

363:35

double check your

363:38

work now we're going to get into manual

363:40

grouping specifically we're going to

363:42

create this where we actually go through

363:45

and aggregate based on the job titles

363:49

itself assigning it into well a group so

363:51

put data analysts scientists and data

363:53

engineers into data nerds senior roles into

363:55

senior data nerds and then these guys

363:57

into other data nerds so we're going to

363:59

create a pivot table for this go in and

364:02

select okay using the jobs table and

364:04

we're going to be grouping the job title

364:06

short so I'll drag that into the rows

364:08

for the time being and we'll just start

364:11

by grouping just the data nerds so I'm

364:14

going to just select one of these and

364:16

then hold down control and then also

364:17

select data engineer and then also data

364:19

scientist then I'm going to rightclick

364:21

it and select group the other way I

364:24

could also do this is go into pivot

364:26

table analyze and select group selection

364:30

the next ones I want to group are senior

364:32

roles so I'm going to just select all

364:33

the different senior roles conveniently

364:35

they're all right next to each other

364:37

then I'm going to right click it and go to

364:39

group so now they're their own group the

364:41

only thing left is getting the rest of

364:43

these I'm actually going to have to control

364:45

select these and then these as

364:47

well and then from there we'll group

364:51

that for group one I'm just going to

364:52

select it come up to the formula bar and

364:55

type in data nerds name group two to

364:58

senior data nerds and then group three

365:01

to other data nerds also going to zoom

365:04

in a little bit to get a little bit

365:05

closer
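Manual grouping is essentially a lookup from each job title to a group name. A minimal plain Python sketch, assuming a made-up list of postings and using the three group names from this lesson:

```python
from collections import Counter

# Group name -> job titles assigned to it (mirrors the three manual groups).
GROUPS = {
    "Data Nerds": ["Data Analyst", "Data Engineer", "Data Scientist"],
    "Senior Data Nerds": [
        "Senior Data Analyst",
        "Senior Data Engineer",
        "Senior Data Scientist",
    ],
    "Other Data Nerds": ["Business Analyst", "Cloud Engineer", "Software Engineer"],
}

# Invert the mapping so every title points at its parent group.
title_to_group = {
    title: group for group, titles in GROUPS.items() for title in titles
}

# Toy postings (made up).
titles = [
    "Data Analyst", "Data Analyst", "Data Engineer",
    "Senior Data Engineer", "Cloud Engineer",
]

group_count = Counter(title_to_group.get(t, "Ungrouped") for t in titles)
print(group_count)
```

Any title missing from the mapping falls into an "Ungrouped" bucket here, which is a choice made for this sketch.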

365:09

now that we have all these grouped let's

365:10

actually dive into performing a

365:13

basically deeper analysis on this

365:16

to look at how or what percentages these

365:19

make up of all the job titles and also

365:23

of their respective groups specifically

365:26

we're going to be looking at on going

365:27

job the job title short over here we're

365:29

looking at the count and how the counts

365:32

of those jobs are going to be of the

365:34

percentages anyway I'm going to change this

365:36

to job count along with going through

365:39

and updating the formatting to use a

365:41

comma and no decimal places so for this

365:44

I still wanted to use that basically

365:46

count of the job title short so with

365:49

these counts we're going to do the

365:50

percentages so I'm still going to use

365:51

that job title short column going to

365:53

drag it into the values we did a count

365:57

but now let's actually go in inside of

365:59

the value field setting remember we got

366:02

to that show value as and inside of here

366:06

we can have different values percent of

366:09

grand total percent of column total

366:11

percent of row total we're just going to

366:12

go percent of grand total to start press

366:15

okay and Bam this is now showing us the

366:19

percent of the grand total now I'm not

366:22

liking how this is ordered right now I'm

366:24

actually going to sort this

366:26

selecting one of the values inside of

366:28

the job count from sort it from largest

366:32

to smallest and then also I want to do

366:34

the actual grand total itself sort

366:36

largest to smallest anyway we updated

366:39

this to percent of grand total we need

366:41

to update the title to specify percent of

366:44

grand total and so we can see that data

366:47

nerds for their parent are taking up

366:50

about 76% almost 3/4 of the jobs are that

366:54

and individually we can see that data

366:56

analysts are nearly 30% of that whereas

366:59

we get down to the other data nerds

367:00

they're only taking up a very small

367:02

percentage now what happens if we want

367:04

to see so what is data analyst of the

367:08

actual parent or what is the cloud

367:09

engineer of the parent other data nerds

367:13

well I can drag that job title short

367:15

into the values again it's aggregating

367:17

by count but I can go in and this time

367:20

I'm actually just going to rightclick it

367:23

and we can have this show value as I'm

367:25

going to use that instead we can do that

367:27

percent of grand total but instead I'm

367:29

going to come down here to percent of

367:32

parent total and in our case it's asking

367:35

us what is the parent now we

367:38

haven't gone over this but it actually

367:40

created that grouping as job

367:42

title short two so I'm going to click

367:44

okay we don't want to do the job title

367:46

short that's not the parent job title

367:49

short 2 is and you can see it

367:52

actually down here job title short 2

367:54

it created inside the rows but anyway

367:56

getting back to the parent now it's

367:58

showing the percent that it takes to

368:01

make the parent and then obviously the

368:02

parent is at 100% so I'm going to rename

368:06

this one percent of parent now we just

368:09

looked at percent of grand total and

368:11

percent of parent but the show value as

368:14

has a lot of different other ones you

368:16

can also do in here if I wanted to I can

368:18

even do something like rank largest to

368:22

smallest once again it's asking us do we

368:24

want to rank part of the parent or part

368:26

of the job title short I want to rank

368:27

part of job title short and it will show

368:29

its individual rankings underneath each

368:32

from highest to lowest I'm going to go

368:34

ahead and undo this I don't want to

368:35

necessarily keep that one more note

368:37

before we go for those that purchase the

368:38

course practice problems and also note I

368:41

also go into calculated items and field

368:45

and have its own little worksheet for

368:47

you to follow along and try out

368:49

calculated items and field I didn't

368:51

necessarily include it in this lesson

368:53

because I felt that it wasn't a very

368:56

powerful feature I instead like

368:58

using measures which we're going to

369:00

cover in the power pivot chapter but if

369:02

you're interested about it I have

369:04

content on it in our notes and those

369:06

calculated fields and items are underneath

369:09

that pivot table analyze tab in here on

369:12

calculated fields and items we're not

369:13

going to be covering it outside of those

369:15

notes that you can follow along and do

369:16

your own self-study with it all right

369:18

you have some practice problems now go

369:19

through and get more familiar with

369:22

these Advanced features and pivot tables

369:24

because in the next lesson we're going

369:25

to be diving into actually making charts

369:28

out of these pivot tables using pivot

369:29

charts with that see you in the next one
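For readers who think in code, the three show values as options from this lesson (percent of grand total, percent of parent total, and rank largest to smallest) can be sketched in plain Python. This is only an illustration of the arithmetic, not how Excel computes it internally. The counts and the senior job titles below are made up; the group names follow the ones used in the lesson.

```python
from collections import defaultdict

# Hypothetical job counts per (parent group, job title); the numbers are made up.
counts = {
    ("Data Nerds", "Data Analyst"): 50,
    ("Data Nerds", "Data Engineer"): 30,
    ("Data Nerds", "Data Scientist"): 20,
    ("Senior Data Nerds", "Senior Data Analyst"): 60,
    ("Senior Data Nerds", "Senior Data Scientist"): 40,
}

grand_total = sum(counts.values())

# Subtotal for each parent group (the "parent" in percent of parent total).
parent_totals = defaultdict(int)
for (group, _title), n in counts.items():
    parent_totals[group] += n


def rank_within_parent(group, n):
    """Rank 1 is the largest count inside the same parent group."""
    siblings = [v for (g, _t), v in counts.items() if g == group]
    return 1 + sum(1 for v in siblings if v > n)


rows = []
for (group, title), n in counts.items():
    rows.append({
        "group": group,
        "title": title,
        "count": n,
        "pct_of_grand_total": n / grand_total,
        "pct_of_parent": n / parent_totals[group],
        "rank_in_parent": rank_within_parent(group, n),
    })

for row in rows:
    print(f"{row['group']:<18} {row['title']:<22} {row['count']:>4} "
          f"{row['pct_of_grand_total']:>7.1%} {row['pct_of_parent']:>7.1%} "
          f"{row['rank_in_parent']:>3}")
```

Each parent group's percent of parent values add up to 100%, which matches what the pivot table shows for the parent row.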

369:36

moving now into pivot charts so we did a

369:39

lot of work already in analyzing things

369:42

with pivot tables we're going to take it

369:44

now to the next level with pivot charts

369:46

specifically we're going to be looking

369:47

at first what is the average salary by

369:50

job title next we'll be looking at which

369:53

job has the highest percent of demand

369:55

and then lastly we'll be looking

369:58

at basically how jobs are trending

370:01

over time we're going to be building all

370:03

these charts using pivot tables

370:05

additionally we're going to include the

370:07

features of slicers and also timelines

370:12

based on what chart we're using in order

370:14

to be able to filter down and more

370:16

easily make our graphs more interactive

370:19

as usual in the advanced chapters I want

370:20

you to start with the Excel workbook

370:24

from the last lesson so pivot tables

370:26

advanced and if you want to see the

370:27

examples or the final answer you could

370:29

go to Pivot charts for this we're not

370:31

going to be using the hierarchy or that

370:34

show detail tab so I'm going to go

370:37

ahead and hide

370:40

those so let's create this first chart

370:43

to analyze what is the top paying job in

370:45

data science for this I'm going to just

370:47

create a new pivot table using

370:50

that jobs table and we're going to be

370:52

aggregating by job title short in the

370:54

rows and then the salary year average in

370:57

the values and for this we want to

370:59

summarize values we don't want to do the

371:01

sum we're going to do the average I did

371:03

this by right clicking it but we do need to

371:05

have these all formatted correctly in

371:08

currency with no decimal places and I'll

371:10

update the title as well to average

371:13

yearly salary so in order to insert in

371:17

this pivot chart we're going to go to

371:19

the insert Tab and we're going to come

371:21

here to Pivot chart there's only one

371:23

option right now because we're selected

371:25

on a pivot table and that's a pivot

371:27

chart itself so I'll go ahead and insert

371:28

it with this there's no recommended

371:30

charts but I know I want a column chart

371:33

so we're going to go with that and pivot

371:35

charts aren't that different from

371:36

regular charts I can come up here select

371:38

this plus sign I can remove things like

371:41

the legend I don't really need that and

371:43

then I can change things like the title

371:45

by just double clicking this to

371:47

something like what is the top paying

371:49

job in data science now you may notice

371:51

these pivot charts are a little bit

371:52

different as they have these field

371:54

buttons on here that basically allow you

371:56

to with the chart itself go in and

371:58

filter it this is really convenient if

372:00

say this chart was on a different

372:01

page anyway I want to have these salaries

372:05

sorted from highest to lowest so I can

372:08

come into here and you know we can go

372:10

sort A to Z or Z to A and you can change

372:14

it around we want to actually sort from

372:16

highest to lowest so I can come in here

372:18

under more sort options and I can change

372:21

this from the job title short column to

372:23

that average yearly salary column and we

372:27

want it to be descending and we'll click

372:29

okay so bam now we have our salary

372:33

oriented from high to low with our

372:35

values if you don't like these field

372:38

buttons right here you can come in and

372:40

right click it and go hide all field

372:43

buttons if you want but if you want to

372:45

get them back you have to come back

372:46

underneath the pivot chart analyze Tab

372:49

and select field buttons and unselect

372:52

this hide

372:55

all the next thing to analyze is which

372:58

job has the highest percentage of demand

373:01

we're going to use that percentage of

373:02

grand total column from before and we're

373:04

going to be adding a little twist with

373:05

this one as we're going to be also

373:07

building in some slicers so we can slice

373:09

the data for what we want so back inside

373:12

the worksheet you should be working in

373:13

so we want the percent of grand total

373:15

only so I'm going to move out count

373:18

percent of parent and also that rank

373:20

count to only have what we want next

373:22

move into getting a pivot chart Built

373:24

For This and once again I'm going to be

373:26

using that column chart I'll go ahead

373:28

and insert that I'm going to rename this to

373:30

which job has the highest percentage

373:32

once again I don't really care about

373:34

that Legend now I want my basically

373:38

target audience whoever I give this to

373:40

to have control to be able to select

373:42

which group they can filter for whether

373:46

that's data nerds senior data nerds or

373:47

other data nerds so in order to control

373:50

that I'm going to first zoom out we're

373:52

going to insert some slicers for this so

373:55

if we come into the pivot chart analyze

373:57

tab with this chart selected

374:01

I'm going to go into insert slicer and

374:05

we're going to do it for remember that

374:06

group from last time is actually

374:08

job title short 2 and also we're going

374:10

to filter this one also by country I'm

374:13

going to click okay they're going to pop

374:15

up here on top of this I don't really

374:16

like this I'm going to drag it over and

374:18

I'm going to fix the formatting real

374:19

quick so now with these slicers I can

374:23

make it a lot easier for somebody using

374:25

this to come in and say hey I only want

374:27

to look at data nerds or I want to look

374:29

at other data nerds and see what their

374:33

appropriate percentage is when you click

374:35

on a slicer you will notice that this

374:37

slicer tab comes up there's some

374:39

different formatting options the one

374:41

thing that I do find myself changing

374:43

is the appropriate label or the slicer

374:46

caption in this case I would rename this

374:49

one to something like job group and then

374:51

for the job country I would just rename

374:53

this to Country and you can see they

374:56

update appropriately here for it as a

374:58

refresher right if you wanted to select

375:00

multiple different options I would

375:03

select this multi select right here and

375:06

then with that enabled I can then select

375:07

data nerds and also senior data

375:12

nerds the last visualization we're going

375:14

to be building with this is a line

375:16

chart looking at how jobs are trending

375:18

over time using that previous pivot

375:20

table we made on the job count for this

375:22

one we're going to be using a timeline

375:24

filter to be able to select down to

375:26

maybe a certain quarter or month so back

375:29

in the workbook that we're working in

375:30

I'm in this group automatic sheet we

375:32

want to create a pivot chart so I go to

375:34

insert and into pivot chart and for this

375:37

one we want a line so I'm going to go

375:40

ahead and insert that I'm going to give

375:41

this an appropriate title of how are jobs

375:44

trending over time additionally I'm

375:47

going to remove that Legend and I want

375:49

to add a trend line to it now you notice

375:52

by this one the actual field values for

375:55

this you have multiple different ones

375:57

here remember it did that automatic

375:59

grouping in the last lesson so you have

376:01

not only months to filter by but days and

376:03

also that job posted date so a lot more

376:05

values here now to add a timeline for

376:07

this I'm going to go up to Pivot chart

376:09

analyze and I'm going to go into insert

376:12

timeline there's only one value that's

376:13

going to be available for this job

376:15

posted date and right now if I expand

376:18

this all the way out we have all the

376:20

different months that are available I'm

376:22

going to go ahead and close this up right

376:24

below it so if I wanted to filter by a

376:27

specific month I could be like hey I

376:29

want from February to November in this

376:32

case October I actually need to select

376:34

February I'm holding down my key for

376:36

this and then dragging to November

376:38

anyway I can also change this with this

376:40

filter not only months but also quarters

376:44

and even something like years I

376:47

typically analyze things in quarters so

376:49

we're going to do it in that manner and I'm

376:51

also going to shift it up here to the

376:53

right hand side similar to slicers if I

376:55

have the timeline selected I can come up

376:58

here and actually change the name in

377:00

this case I'm going to change it to date

377:02

I could also change thing like

377:04

formatting or even things like color now

377:07

one thing to note with this with what I

377:09

have selected here it's only going to

377:11

filter what I have the chart set up to

377:14

or what I actually created the timeline

377:16

while the pivot chart was selected so

377:17

let's say I came into here and I wanted

377:19

to look at in our case just data nerds

377:22

and then also go into looking at the

377:25

counts themselves this isn't necessarily

377:27

going to update for that those slicers

377:30

aren't connected to other charts but you

377:33

can change it to do that so in this case

377:35

I could select something like the pivot

377:37

table itself going into pivot table

377:39

analyze and then here under filters

377:42

where you can create things like slicers

377:44

and timelines which we did in the pivot

377:46

chart anyway they have this thing called

377:47

filter connections and I'm going to

377:50

expand this out so we can actually see

377:52

it and right now we're saying that well

377:54

for pivot table 3 as we can see up here

377:56

probably need to give these even better

377:58

names only the date is actually

378:02

connected to this if I wanted to connect

378:03

the other ones such as country or job

378:05

group I'd have to select them and press

378:08

okay now I don't know if you noticed

378:10

that but it actually adjusted these

378:12

values actually decreased because I have

378:15

fewer values selected here whereas if I

378:17

actually select more all of these going

378:20

on this is going to increase the values

378:24

anyway that's sort of hard to see let's

378:25

actually show this with sheet one

378:29

which actually should be something like

378:31

top paying jobs and in this case I can

378:35

go into pivot chart analyze into filter

378:39

connections and this is going to show us

378:41

based on pivot table 7 which is this one

378:44

right here I should have renamed these

378:46

there's no different slicers or

378:48

timelines attached to it so I can

378:49

actually select all of these and apply

378:52

it to this one and now when I go to our

378:55

grouping right here right so we had all

378:57

of them selected if I want to just look

378:59

at data nerds here so I can see the

379:01

percentages of data analyst data engineer

379:02

and data scientist I can see what their

379:05

salaries are for it and then also I can

379:08

see their counts for those as well so

379:11

this is definitely a useful feature if

379:12

you're looking to link charts or

379:15

specifically pivot tables that are not

379:17

necessarily

379:18

connected all right now it's your turn

379:20

to get more familiar with using pivot

379:22

charts we have some practice problems

379:24

that you can go through to actually

379:25

understand more about how to use them
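As a rough mental model of slicers and filter connections, one shared filter can feed several summaries at once. The sketch below uses plain Python with made-up postings; the function names are hypothetical and nothing here is an Excel API. It filters once (the slicer), then builds both the average salary view sorted from highest to lowest and the percent of total view from the same filtered rows (the connected pivot tables).

```python
from collections import defaultdict

# Hypothetical job postings; every value here is made up for illustration.
jobs = [
    {"title": "Data Analyst", "group": "Data Nerds", "country": "United States", "salary": 90000},
    {"title": "Data Analyst", "group": "Data Nerds", "country": "Canada", "salary": 80000},
    {"title": "Data Engineer", "group": "Data Nerds", "country": "United States", "salary": 130000},
    {"title": "Data Scientist", "group": "Data Nerds", "country": "United States", "salary": 120000},
    {"title": "Senior Data Scientist", "group": "Senior Data Nerds", "country": "United States", "salary": 160000},
    {"title": "Senior Data Engineer", "group": "Senior Data Nerds", "country": "Canada", "salary": 150000},
]


def apply_slicers(rows, **selected):
    """Keep rows whose value for each field is in the selected set, like a slicer."""
    return [r for r in rows
            if all(r[field] in allowed for field, allowed in selected.items())]


def average_salary_by_title(rows):
    """Average salary per title, sorted from highest to lowest."""
    sums, n = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r["title"]] += r["salary"]
        n[r["title"]] += 1
    averages = {title: sums[title] / n[title] for title in sums}
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)


def percent_of_total_by_title(rows):
    """Share of postings per title (percent of grand total)."""
    n = defaultdict(int)
    for r in rows:
        n[r["title"]] += 1
    return {title: count / len(rows) for title, count in n.items()}


# One slicer selection drives both summaries, like connected pivot tables.
filtered = apply_slicers(jobs, group={"Data Nerds"})
top_paying = average_salary_by_title(filtered)
demand = percent_of_total_by_title(filtered)

for title, avg in top_paying:
    print(f"{title:<16} {avg:>10,.0f} {demand[title]:>6.0%}")
```

Passing more than one value in a set, for example `group={"Data Nerds", "Senior Data Nerds"}`, plays the role of the multi select button on the slicer.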

379:27

with that in the next lesson we're going

379:29

to be jumping into well the next chapter

379:32

on Advanced Data analysis and using some

379:34

pretty unique and pretty complicated

379:37

features in order to analyze data so

379:39

with that I'll see you in that

379:45

one welcome to this chapter on Advanced

379:48

Data analysis and this entire chapter is

379:51

really focused on using addins which are

379:55

basically programs that people have

379:56

built to incorporate into Excel to do

379:59

very unique and specific tasks because

380:03

of that going from lesson to lesson

380:05

we're not going to necessarily be

380:07

building on each other as we go through

380:09

these lessons every lesson is going to

380:10

be sort of its own unique sort of

380:12

Learning Journey about a specific

380:14

feature or features to start with this

380:17

lesson we're looking at just enabling

380:19

the add-ins and looking at some basic

380:21

ones such as what if analysis and we're

380:24

going to get more into it in a second

380:25

but we're going to be focused on looking

380:27

at if we have three different job offers

380:30

which one should we actually take in the

380:32

next lesson we're going to be continuing

380:34

on with what if analysis focusing on data

380:36

tables and this shows us how values are

380:38

going to be changing based on one or

380:41

multiple variables and then finally the

380:43

third lesson is on an add-in called

380:46

Analysis ToolPak that provides us

380:49

access to a lot of different statistical

380:52

analysis that we can just easily select

380:55

what type of analysis we want to perform

380:57

and it does all the analysis for us

380:59

and provides it in a sheet pretty neat

381:02

anyway getting into this lesson we're

381:04

going to start by first enabling these

381:06

add-ins so that way you have it and then

381:07

from there we're going to move into our

381:09

first somewhat simple example

381:11

forecasting what's going to happen into

381:13

the future specifically we're going to

381:15

look at our past job postings and try

381:17

to predict what's going to happen in the

381:19

future from there we're going to be

381:20

moving into what if analysis and for

381:24

this we're going to have a scenario

381:26

where we have three job offers and we're

381:28

trying to find what is the most optimal

381:30

one we're going to use things like

381:32

scenario manager to go through and

381:34

automatically calculate what it should

381:36

be for those three different job offers

381:39

and then let's say we need to actually

381:40

negotiate one of those job offers and we

381:43

want to match another we can use solver

381:45

or Goal Seek and both of these have

381:48

their own unique features

381:50

that we're going to dive into to allow

381:52

us to adjust what we could potentially

381:54

negotiate for better job offers one

381:57

quick reminder on which versions of

381:59

excel will support this chapter on

382:02

Advanced Data analysis all of them will

382:04

with the exception of Microsoft online

382:07

it doesn't have the ability to add in

382:10

these specific add-ins but if you're on Mac

382:13

or the windows version you're going to

382:14

be completely fine so for this we're

382:16

going to be working inside of the

382:18

analysis addins workbook I know I said

382:21

previously you need to work with the

382:22

previous workbook from the previous

382:23

lesson but this chapter in general

382:26

doesn't build on anything it has

382:27

everything you need within the workbook

382:30

so you're going to be fine with this

382:31

anyway we just need two sheets from this

382:33

forecast original and what if analysis

382:35

all the others are just the results that

382:37

we're going to be getting and feel free

382:39

to go through and select the sheets that

382:41

we're not using so these four in this

382:42

case and hide them so that way we only

382:46

have the two sheets of forecast original

382:48

and what if

382:52

analysis so before we enable addins I

382:54

think you need to know what are exactly

382:57

Excel add-ins here I am in Perplexity AI

383:00

and I asked the question and it goes

383:02

on to specify what it is by saying

383:04

that it basically interacts with Excel

383:06

objects and data and it will add custom

383:09

ribbon buttons or menu items and thus

383:12

providing custom functions now this is a

383:14

little technical but there are three

383:16

different types of add-ins they have web

383:18

Excel and com add-ins today we're going

383:20

to be importing in Excel addins which

383:23

are actually created using something

383:24

like VBA anyway the most popular Excel

383:27

add-ins are things like solver power

383:29

pivot power query you don't necessarily

383:31

have to add in unless it's not included

383:33

and then also things like Analysis

383:35

ToolPak which we're going to get to in that

383:36

third lesson all right enough on the

383:38

history lesson let's actually get into

383:39

enabling your addins if you go to the

383:41

data tab right now you'll probably see

383:43

that you have this forecast section so

383:45

you do have what if analysis available

383:47

but you don't have anything else over

383:50

here right now it's um well usually

383:52

blank but we're going to add to it so

383:53

I'm going to go into file and then from

383:55

there it's hidden but under more I'm

383:57

going to go to options on the menu on

383:59

the left hand side I'm going to go into

384:01

addins and this menu right here tells

384:03

you what your active application addins

384:06

are right now I have no active

384:07

applications and then your inactive

384:10

application addins so I do have access

384:12

to all these different ones right here

384:14

so I want to enable them specifically I

384:17

want this Analysis ToolPak and then

384:19

well the one we're going to use in this

384:20

lesson solver so um on manage I have

384:24

Excel addins that's the one that I want

384:25

to actually use for this I'm going to

384:27

click go and now we need to enable which

384:29

ones we're going to use so Analysis

384:31

ToolPak for the third lesson and solver for

384:33

this one from there I'm going to click

384:35

okay and now over here on the right hand

384:38

side we have analysis popup data

384:40

analysis which is the Analysis ToolPak

384:42

and then solver is the solver

384:47

add-in so let's actually get into

384:50

forecasting specifically looking at what

384:52

we expect job postings to be next

384:55

year and right here in the forecast

384:57

original sheet I have date and then also

385:00

the job count and

385:03

this is all the data for 2023

385:07

anyway this example is going to show the

385:08

custom features that we really can do

385:11

with some of these add-ins and also

385:12

built-in features so I can select the

385:15

date and job count column and then for

385:18

this we're going to go into the forecast

385:20

and specifically to forecast sheet in

385:23

this it plots in blue what are our

385:26

values that we currently have for

385:28

basically 2023 and then from there it

385:31

plots into the future using this orange

385:33

I can toggle this between a line

385:35

chart and also a column chart but I'm

385:38

not really finding the column chart that

385:39

useful it's time series data so I'm

385:41

going to go back to that line chart the

385:43

other major thing I can control is the

385:44

forecast end date so if I wanted to only

385:46

do maybe two months I could change this

385:49

instead to end in March additionally

385:52

I have hidden underneath this drop down of

385:54

options the ability to go in and

385:57

actually change other things like

385:59

confidence interval and seasonality and

386:01

things like that right now it's

386:03

automatically set up to basically detect

386:06

automatically and seasonality is as you

386:08

notice in this data it goes up and down

386:11

up and down up and down it has a

386:12

seasonality to it basically every single

386:15

week there's more postings during the

386:16

week and on the weekend there's less as

386:19

expected so this seasonality is carried

386:23

out into the predicted data as you can

386:25

see here because it's still in the

386:27

orange actually goes up and down anyway

386:30

going to close this this is great I'm

386:31

going to click create in this new sheet

386:34

it automatically has this popup here

386:36

that says this table contains a copy of

386:38

your data with additional forecasted

386:40

values at the end you can manually edit

386:42

the forecasting formulas in the sheet or

386:43

return to the original data to create a

386:45

different forecast worksheet okay great
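To make the weekly seasonality idea concrete, here is a deliberately simple stand-in in Python. It is not what Forecast Sheet does (Excel's forecast is based on exponential smoothing through its FORECAST.ETS functions); this sketch just averages past values that fall on the same position in the seven day cycle. The posting counts are made up, and the function name is hypothetical.

```python
def seasonal_average_forecast(history, season_length, horizon):
    """Forecast `horizon` steps ahead by averaging the past values that
    share the same position in the seasonal cycle."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    forecast = []
    for step in range(horizon):
        # Position of the future point inside the cycle (0 .. season_length - 1).
        position = (len(history) + step) % season_length
        same_position = history[position::season_length]
        forecast.append(sum(same_position) / len(same_position))
    return forecast


# Hypothetical daily job counts: five busy weekdays, then a quiet weekend.
week = [120, 130, 125, 128, 110, 40, 35]
history = week * 4  # four identical weeks of history

next_week = seasonal_average_forecast(history, season_length=7, horizon=7)
print(next_week)
```

Because the made-up history repeats exactly, the forecast simply reproduces the weekly up and down pattern, which is the behavior the orange forecast line shows in the chart.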

386:46

got it I'm going to zoom out a little

386:48

bit and what this table did is it still

386:50

kept that date and job count column but

386:52

it also built out three other columns to

386:55

actually look at scroll all the way down

386:58

what the forecast would be a lower

387:01

confidence band and then an upper

387:02

confidence band and then looking at the

387:05

actual chart that it provides we can see

387:08

this where this darker orange color is

387:11

what the forecast is this is the

387:13

upper band and then this is the lower

387:14

band anyway that's pretty cool that I

387:16

could generate this all by just clicking

387:19

a single button of forecast

387:23

sheet all right now we're going to move

387:25

into what if analysis and we click this what

387:28

if analysis we have three different

387:30

things here we have scenario manager

387:32

Goal Seek and data table

387:34

for this one we're going to start with

387:36

scenario manager but let's first go over

387:38

what the data is here in the sheet that

387:40

we're basically trying to

387:42

calculate first let's focus on these

387:44

columns B and C this is a if you will

387:47

dashboard or calculator that I built so

387:49

I can put into here a base salary a

387:52

bonus rate and then an annual raise

387:54

amount and it will calculate it so let's

387:56

say our base salary is 12,000 I can put

387:59

that into here assuming the same 10% and

388:01

1.5% it's going to automatically update

388:03

for this over here on the right hand

388:05

side in E through H over here we have

388:09

three different job offers that we

388:12

received and they consist of the base

388:14

salary the bonus rate and the annual

388:17

raise underneath here this fourth or

388:19

fifth row if you will this is

388:21

constraints that we're going to use

388:22

later on I would just ignore this right

388:24

now so what's going on down here in the

388:26

result cell well what we're doing is

388:29

we're

388:30

calculating what the expected salary is

388:34

for year zero all the way to year four

388:37

and then from there we're actually

388:38

getting a total so in this case this is

388:41

summing up all these values right here

388:44

so why am I doing four years why am I doing

388:46

a total for these four years well

388:48

the Bureau of Labor Statistics basically

388:51

estimates that the

388:53

average tenure at a company is four

388:55

years so the idea with this calculator

388:59

that I've made is that we're able to

389:01

calculate based on a job offer we receive

389:04

what would we expect if we were to stay

389:07

at the basically average amount or

389:09

median amount of time that a normal

389:11

person stays at a job instead of just looking

389:13

at what's the first year because

389:15

sometimes things like bonuses and annual

389:18

raise may actually push us into higher

389:21

salaries even though the base salary is

389:24

lower than another salary so it

389:26

basically helps calculate this out and

389:27

even the playing field for these three

389:30

jobs that we're trying to calculate

389:31

anyway you can go through if you want to

389:33

and understand what formulas are

389:35

going on behind the scenes here but

389:37

basically I'm just taking into account

389:39

these three parameters right here and

389:41

then every year basically starting with

389:43

that previous year's salary and then adjusting

389:46

it for the annual raise and then giving

389:48

it its appropriate bonus so as expected

389:51

because there's an annual raise on each

389:52

one of these the salaries are going up
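The workbook's formulas are not shown here, so the following Python sketch is an assumption about how such a calculator could work: each year's salary grows by the annual raise, and the bonus is a percentage of that year's salary. The function name `yearly_pay` is hypothetical. With a 100,000 base, a 10% bonus, and a 1.5% raise, this assumed formula totals roughly 566,700 over years zero through four, which is in line with the figure of about 566,000 quoted for job one.

```python
def yearly_pay(base, bonus_rate, raise_rate, years=5):
    """Pay for year 0 .. years-1 (assumed formula, not the workbook's own).

    The salary grows by raise_rate each year, and the bonus is paid as a
    percentage of that year's salary.
    """
    pay = []
    salary = base
    for _ in range(years):
        pay.append(salary * (1 + bonus_rate))
        salary *= 1 + raise_rate  # apply the annual raise for the next year
    return pay


# Inputs: 100,000 base salary, 10% bonus rate, 1.5% annual raise.
job_one = yearly_pay(100_000, 0.10, 0.015)
total = sum(job_one)

for year, amount in enumerate(job_one):
    print(f"year {year}: {amount:,.0f}")
print(f"total: {total:,.0f}")
```

The yearly amounts rise each year because of the raise, and the last line plays the role of the total cell in the results section.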

389:54

so with that what is going on here do I

389:57

need to actually go through and actually

389:59

put in every single one of those jobs so

390:02

I'll put in job one and get the 566,000

390:05

and then now do the second job and third

390:07

job no I can use scenario manager for

390:10

this so going into what if analysis I

390:12

select scenario manager and we're going

390:14

to add three different scenarios so I'm

390:17

going to come up here and select add

390:20

this scenario name we're going to call

390:21

it job one next we're going to move into

390:25

what we're going to use for the changing

390:26

cells and I've labeled these basically

390:29

or made these into an input format we're

390:32

going to select these three right here

390:33

so C3 through C5 we'll leave the comment

390:36

as is protection as prevent changes and

390:38

go to okay now it's going to ask us what

390:42

values we want to use for each in this

390:44

case I use 100,000 10% and 1.5 it's

390:47

already filled in pre-filled in from

390:49

there I'm going to click okay now we

390:51

need to add job two for this I'm going

390:53

to leave changing cells the same this

390:55

one I'm going to change to 80,000

390:58

15% and then change this bottom one to

391:02

1.2%
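Scenario manager is essentially a named set of inputs run through the same formulas. A minimal Python sketch of that idea follows, assuming the same simple pay formula as before (raise compounds yearly, bonus is a percentage of that year's salary) and assuming a base salary of 80,000 for job two. Job three is left out because its inputs are not stated. The names `total_pay` and `scenarios` are hypothetical.

```python
def total_pay(base, bonus_rate, raise_rate, years=5):
    """Total pay over `years` years under an assumed formula:
    salary compounds by raise_rate, bonus is a share of each year's salary."""
    total, salary = 0.0, base
    for _ in range(years):
        total += salary * (1 + bonus_rate)
        salary *= 1 + raise_rate
    return total


# Each scenario is just a named set of values for the three changing cells.
scenarios = {
    "Job 1": {"base": 100_000, "bonus_rate": 0.10, "raise_rate": 0.015},
    "Job 2": {"base": 80_000, "bonus_rate": 0.15, "raise_rate": 0.012},  # base assumed
}

# The "scenario summary": run every scenario through the same calculation.
summary = {name: total_pay(**inputs) for name, inputs in scenarios.items()}

for name, total in summary.items():
    print(f"{name}: {total:,.0f}")
```

Adding another offer is then one more dictionary entry, which is the convenience scenario manager provides over retyping the inputs by hand.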

391:04

then finally we need to add that job

391:05

three one of the last steps we need to

391:07

do is now go into summary right here and

391:10

for this we need to figure out what we

391:12

want to actually have it provide for us

391:15

in our case we want the result cells of

391:17

C9 through C14 to be provided from there

391:21

we click okay and bam we're going to get

391:24

this scenario summary sheet that goes

391:27

through in details based on job one job

391:31

two and job three for the value that we

391:34

input into it and from there it's going

391:36

to tell us what year zero is year 1 2 3

391:41

all the way down to the total salary now

391:43

one thing to note is you see these names

391:46

of Base bonus raise year zero uh and

391:48

then total salary if I go back to what

391:51

if analysis I've actually gone through

391:53

already for you and actually named these so

391:56

in this case I'm selecting zero it's

391:58

named year zero and total salary if I

392:01

were to use things that were maybe not

392:04

named it would just provide the cell so

392:06

if we're using the values here it would

392:08

just provide F6 and in that

392:10

case we would have seen F6 here also back

392:13

in the scenario summary you may not have

392:14

ever seen this before but Excel allows

392:17

this sort of grouping if you will to

392:20

basically manipulate the sheets and what

392:23

values are hidden or potentially shown

392:26

here anyway pretty unique feature that

392:28

you may or may not have seen

392:32

before all right moving on to Goal

392:35

Seek let's say we have the scenario

392:38

now where we got the job offer for job

392:42

one in this case but we want to try to

392:45

match that of job three specifically if

392:48

I go back to that scenario summary sheet

392:49

we can see that job one is at around

392:53

566,000 but job three is at 640,000

392:56

we'll say we have some Insider

392:58

information that human resources told us

393:02

hey we can't adjust the base or the

393:03

bonus but we can adjust the raise what

393:07

raise you get every year and so you

393:09

could potentially ask for a higher raise

393:12

what raise would you need to basically

393:14

put into here to get equal to that job

393:17

three so the first thing I'm going to do

393:18

is go in and make sure that we have

393:20

inside of our formula input in the job

393:22

one actual statistics of it so 100,000

393:26

10% and 1.5% for the annual raise now I

393:30

could go through there so I type 1.7%

393:33

and then 1.8% and just keep on going up

393:36

until I actually find what it is or

393:39

instead we can just actually use this

393:40

Goal Seek and for this we're going to

393:42

be setting a cell specifically cell

393:45

C14 to that 640,000 that we want to get

393:50

to and we need to provide what cell

393:52

we're going to actually change in this

393:54

case we're going to change cell

393:57

C5 which is the annual raise now for this

394:00

we can only change one option we're

394:02

going to be able to change multiple in

394:03

the next scenario but not in this one of

394:05

Goal Seek so from there I'll go ahead

394:07

and click okay and Bam automatically

394:11

goes through I don't know if you saw

394:13

that it stepped through it and it went up to

394:16

7.6% and that's what we'll need in order

394:18

to get to that 640,000 and it even

394:21

provides a nice dialogue box saying

394:23

that hey it did find a solution
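The same single input search can be sketched in Python with a simple bisection: keep narrowing the range for the annual raise until the total hits the target. This is only an illustration of the idea, not necessarily the method Excel uses, and it relies on an assumed pay formula (raise compounds yearly, bonus is a percentage of that year's salary). The names `total_pay` and `goal_seek` are hypothetical.

```python
def total_pay(base, bonus_rate, raise_rate, years=5):
    """Total pay over `years` years under an assumed formula."""
    total, salary = 0.0, base
    for _ in range(years):
        total += salary * (1 + bonus_rate)
        salary *= 1 + raise_rate
    return total


def goal_seek(func, target, low, high, tolerance=1e-9, max_iter=200):
    """Find x in [low, high] with func(x) close to target by bisection.

    Assumes func is increasing on the range and that the target lies
    between func(low) and func(high).
    """
    if not func(low) <= target <= func(high):
        raise ValueError("target is not reachable inside the search range")
    for _ in range(max_iter):
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid
        else:
            high = mid
        if high - low < tolerance:
            break
    return (low + high) / 2


# Change only the annual raise; base (100,000) and bonus (10%) stay fixed.
needed_raise = goal_seek(lambda r: total_pay(100_000, 0.10, r),
                         target=640_000, low=0.0, high=0.5)
print(f"needed annual raise: {needed_raise:.2%}")
```

Under these assumptions the search lands at a raise of roughly 7.6%. If the target were outside the search range, the function raises an error instead of returning a value, similar in spirit to Goal Seek reporting that it could not find a solution.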

394:25

sometimes you may put a goal in that's

394:27

not achievable and in this case

394:29

it would tell

394:32

you so

394:34

7.6% is a pretty high raise let's say

394:38

we get further information from HR

394:40

saying hey we can actually change not

394:43

only the annual raise but also your

394:45

bonus we still have the same scenario

394:47

you can't change the base salary it needs

394:49

to stay at 100,000 for that first year

394:52

so we have multiple parameters now that

394:54

are changing this is when we're going to

394:56

shift from using this Goal Seek now

394:59

over to solver one thing before we start

395:02

we need to actually reset these values

395:04

in here I'm going to change this back to

395:08

1.5% both of these Step Up in value so

395:10

you want to reset it before you go so

395:13

opening up solver I'm going to set the

395:16

objective as before that C14 of that

395:19

total salary and we want to get it to a

395:22

salary of 640,000 and we want to do this

395:26

by like we said we can change two things

395:28

in this case the bonus and the annual

395:30

raise we can also add constraints which

395:33

we'll do in a second after we just run

395:36

through this one but I want to actually

395:37

just go through and solve it first and

395:40

the last thing we need to look at is

395:42

select a solving method we're going to

395:44

just leave it here I really like this

395:45

GRG Nonlinear we'll leave it at that for

395:48

the time being and we'll go ahead and

395:49

click solve now for this it says solver

395:52

found a solution all constraints and

395:54

optimality conditions are satisfied as

395:56

we can see it increased the bonus and

395:58

then also the annual raise and we got to

396:00

that 640,000 inside of this popup box we

396:04

can have it output certain reports so

396:06

I'm going to just hold control and

396:08

select multiple different reports along

396:11

with clicking this for outline reports

396:13

that it's going to actually print to

396:14

different sheets and from there click

396:18

okay anyway the most important of these

396:20

three different reports that it gave to

396:22

us I feel is the answer report it basically

396:24

tells us hey what were the original

396:26

values put in for the bonus and raise

396:29

and then what are the final values in

396:31

order to get to that final value of

396:33

640,000 they also have these two other

396:36

reports one on sensitivity analysis and

396:39

the other one evaluating the limits

396:40

which we're going to get to um but these

396:42

I don't find as important so now with

396:44

this with solver we found that we can

396:46

adjust more than one different input now

396:49

we can also specify constraints if I

396:52

come back up to solver and it says Hey

396:54

in this dialogue box subject to the

396:58

constraints right now the annual raise is

397:00

sort of low still at 2.1% but that bonus

397:04

skyrocketed it was previously at 10% and

397:07

it went all the way up to 23% so we

397:09

could actually put some constraints in

397:11

by clicking add and we'll say hey the

397:13

bonus we're not going to let that exceed

397:17

15% we'll click add for that and then

397:20

for the next one we don't want the

397:23

annual raise to exceed we'll say 4% and

397:27

we'll click okay remember I did name

397:30

these cells so that's why it pops up

397:32

automatically as bonus and raise makes it

397:34

super easy whenever you name cells all

397:36

right let's go ahead and click solve so

397:39

look at this solver could not find a

397:42

feasible solution with these constraints

397:45

basically maxed out the bonus and maxed

397:48

out that annual raise and we didn't get

397:49

to that 640,000 so what I can do is I

397:52

can return to the solver parameters

397:54

dialogue click okay and in this case

397:57

I'll change the bonus to we'll say 20%

398:01

now and then for the raise we'll change

398:03

this to 5% click okay and then try to

398:07

solve again and we found a solution we

398:11

have 17% and

398:14

4.4% and for this I'm going to Output

398:16

the answers I'll click outline reports

398:19

to export it click okay close this out

398:22

and then go to the report we can see

398:25

what our final values are along with

398:28

how we got to our 640,000 final value
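
The two Solver runs above can be mimicked with a small Python sketch. It is a stand-in, not Solver itself: the salary model is the same assumed one (pay in year k = base x (1 + bonus) x (1 + raise)^k for years 0 through 4, 100,000 base), GRG Nonlinear is not reproduced, and the hypothetical solve_two_inputs simply walks from the starting values toward the caps until the total hits the target. Because many bonus and raise pairs reach 640,000, the pair it returns will generally differ from the one Excel reports.

```python
def total_salary(bonus, raise_rate, base=100_000, years=5):
    """Assumed model: total pay over years 0..4, bonus paid on each year's salary."""
    return sum(base * (1 + bonus) * (1 + raise_rate) ** k for k in range(years))


def solve_two_inputs(target, start, caps, tol=1e-10):
    """Move (bonus, raise) from `start` toward `caps` until the total hits `target`.

    Returns a (bonus, raise) pair, or None when even the caps fall short,
    which corresponds to the "could not find a feasible solution" message.
    """
    if total_salary(*caps) < target:
        return None
    if total_salary(*start) >= target:
        return start
    lo, hi = 0.0, 1.0  # fraction of the way from start to caps
    while hi - lo > tol:
        t = (lo + hi) / 2
        point = tuple(s + t * (c - s) for s, c in zip(start, caps))
        if total_salary(*point) < target:
            lo = t
        else:
            hi = t
    return tuple(s + hi * (c - s) for s, c in zip(start, caps))


start = (0.10, 0.015)  # bonus 10%, raise 1.5%, the reset values
print(solve_two_inputs(640_000, start, (0.15, 0.04)))  # caps of 15% and 4%
print(solve_two_inputs(640_000, start, (0.20, 0.05)))  # caps of 20% and 5%
```

With the 15% and 4% caps the function returns None because even the maxed-out inputs stay below 640,000 in this model; with the looser 20% and 5% caps it returns a feasible pair.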

398:32

all right you've got some practice

398:33

problems to now go through and try these

398:36

different features out of scenario

398:38

manager and Goal Seek and also Solver

398:41

and I think once you play around with

398:42

them more you can find out which one is

398:44

more applicable to which scenario with

398:47

that I'll see you in the next section

398:49

where we're going to be going deeper

398:51

into what if analysis specifically on

398:53

data tables one of my favorite features of

398:55

what if analysis with that see you there

399:01

let's now get wrapped up on what

399:04

if analysis by focusing on data tables

399:07

we're going to be focusing on building

399:09

one input and also two input data tables

399:12

for the first one on one input we're

399:14

going to be continuing on with that

399:15

exercise from last lesson looking into

399:19

that job offer one and seeing how we

399:21

could change the annual raise in order to

399:25

thus affect different salaries at our

399:28

4-year point and mainly the total salary

399:31

at this point and from there we're going

399:32

to shift into building two input data

399:34

tables where we're not only analyzing

399:37

that annual raise increase but also a

399:39

change in the bonus rate to see what the

399:42

different salaries are for that final

399:44

total amount of those four years so for

399:47

this lesson and also for this chapter

399:48

we're going to be starting with the

399:49

actual workbook of the name of the

399:51

chapter in this case data tables and

399:54

we're going to be working in this original

399:55

sheet but I want to jump into that one

399:57

input to basically show you what we're

399:59

going to be building

400:03

we're going to be inputting into here

400:05

the annual raise percentage we're going

400:07

to put it in increments of 0.5% and then

400:11

along the top in the row we're going to

400:13

be inputting the values from over here

400:17

um and these values right here across

400:19

the top and then the data table itself

400:22

is going to fill this in with the

400:26

expected result so in this case year

400:29

three at a 2% increase in raise

400:33

it's going to get around

400:35

116,000 we also do some coloring at the

400:37

end uh the data tables don't do that

400:39

that's done with conditional formatting
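
The table being described can be mocked up in Python to show what the data table computes. This is a sketch under assumptions: the workbook formulas are not shown, so the model (pay in year k = base x (1 + bonus) x (1 + raise)^k, with a 100,000 base and a 10% bonus) and the name yearly_salaries are made up for illustration. A one-input data table just re-evaluates the same formulas once per value in the input column.

```python
def yearly_salaries(raise_rate, bonus=0.10, base=100_000, years=5):
    """Assumed model: pay in year k = base * (1 + bonus) * (1 + raise_rate) ** k."""
    return [base * (1 + bonus) * (1 + raise_rate) ** k for k in range(years)]


# Column input values: annual raise from 0% to 4% in 0.5% steps.
raise_rates = [step * 0.005 for step in range(9)]

# One row per raise rate: year 0..4 salaries followed by the total.
table = {}
for rate in raise_rates:
    salaries = yearly_salaries(rate)
    table[rate] = salaries + [sum(salaries)]

header = ["raise"] + [f"year {k}" for k in range(5)] + ["total"]
print(" ".join(f"{name:>10}" for name in header))
for rate, row in table.items():
    print(f"{rate:>10.1%} " + " ".join(f"{value:>10,.0f}" for value in row))
```

In this assumed model the year 3 value at a 2% raise comes out near 116,700, in the same ballpark as the figure quoted above.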

400:41

so here we are back in the original

400:42

sheet first thing we need to do is get

400:44

the annual raise put up here remember we

400:48

want to go in we'll say 0.5% increments so

400:51

I'll do zero

400:54

0.5% and then for the rest of these I'll

400:57

just drag them down I end up messing up

400:59

the formatting so I'll just clear the

401:01

borders and then put a border back

401:04

around the outside now for the salaries

401:07

I want that to be for what year zero

401:10

then year 1 I'll drag this on over for

401:13

these and then we'll put a total so for

401:15

this I want to enter in that year zero

401:17

we're going to be doing this for all the

401:18

different values right there I'm going

401:20

go ahead and put it in if you notice it

401:22

has this line through it and I actually

401:24

click it and then from here whenever I

401:27

look into it it provides the error of

401:28

stale value you may or may not see this

401:32

but I'm going to show you how to fix this

401:34

if you are experiencing this you can go

401:36

into file and then into more and then

401:39

options and what happens is under

401:42

formulas my workbook calculations went

401:45

from basically automatically calculating

401:47

to manual where under manual if I

401:50

look at this little icon right here they

401:52

can be manually calculated by pressing

401:55

F9 or going to formulas calculate now

401:58

anyway there's nothing wrong with having

401:59

automatic calculations that's actually

402:01

what I want all the time somehow my

402:02

thing switched into this manual if yours

402:04

does switch it back to automatic click

402:07

okay bam we're good to go and we'll

402:09

continue on now it's important for up

402:12

here at the top that we have them equal

402:15

to the formulas here because this is

402:18

what's going to be ultimately getting

402:20

changed and manipulated so I wouldn't

402:23

want to go through and actually manually

402:24

fill this in with a 110,000 it needs to

402:26

be connected to the formula that

402:28

actually is getting calculated so

402:30

building our data table now I'm going to

402:32

select this entire range right here E3

402:36

all the way down to K12 go to the data

402:38

Tab and select data table now this

402:41

provides us two inputs a row input cell

402:44

and a column input cell we're only doing

402:46

a one input data table so we only need

402:48

to fill in one of these specifically

402:51

we're looking for the input either into

402:53

the row or the input into the column in

402:55

this case we're going to be subbing in

402:57

this column this E column right

402:59

here we're going to be subbing it into

403:01

the formula here and it wants to know

403:03

what is the input cell for in this case

403:05

the column so I'm going to go ahead and

403:07

select it it's C5 I'm gonna go ahead and

403:10

click okay and it's going to

403:13

automatically fill it in now what's

403:15

unique about this is I could also go in

403:17

here if I wanted to and maybe change

403:19

this to something like 10% and it will

403:23

update this entire data table with that

403:26

new value I'm actually going to change

403:27

that back to 3% but pretty unique anyway

403:30

if I wanted to I can come in also and

403:32

I'll also go in and conditional format

403:35

it I'm only going to select year 0

403:37

through four and I'm going to do a white

403:39

to green and then for the total I'm

403:41

going to do its own because it's almost

403:42

in its own bracket here right it's a

403:44

sum of all those different values so I'm

403:46

also going to do the same thing of the

403:49

white to green and then you know me I

403:51

don't really like green so I'm

403:53

going to go ahead and select this and

403:54

I'm going to end up changing this by

403:55

going into manage rules in conditional

403:57

formatting selecting on this one

404:00

adjusting the color to Blue and also

404:02

selecting this one and changing this one

404:04

to Blue as well click apply and then

404:07

okay and

404:11

Bam so with that example complete let's

404:13

move into a two input data table and

404:15

let's look at the final example for this

404:17

for this we're going to have as we had

404:21

before the annual raise in the column but

404:24

this time we're going to have the bonus

404:27

up on that top row and for this we're

404:30

going to be calculating as we click here

404:32

it's going to be calculating

404:33

C14 which is the total salary we're not

404:36

going to be calculating that year 0 through

404:38

4 anymore and it's going to go through

404:40

and calculate it for all of these

404:43

different scenarios if you will all

404:45

right to do this I'm going to go back to

404:47

that original sheet I'm going to

404:49

actually duplicate this by saying copy

404:52

it create a copy and click okay okay so

404:56

now we have original 2 so I'm going to

404:58

rename original to one input and then

405:02

rename original 2 to two input now for this

405:05

one I'm going to end up just clearing

405:07

the contents from here I'll go to

405:09

editing clear and I'll just select clear

405:12

contents and now thinking about it I

405:14

want to also clear any of the formatting

405:16

that's in here cuz we're going to be

405:17

doing something different with it I can

405:19

go into clear rules I can go clear rules

405:22

from entire sheet all right so we

405:24

have the raise in the rows and now we

405:27

need the actual bonus in the columns for

405:30

this we'll go from 0% to

405:33

5% and I need to actually change this

405:36

formatting to actually be a percentage

405:38

and then drag this all the way through

405:40

along with fixing this formatting so now

405:43

a two input data table is a little bit

405:46

different in that we need in the upper

405:48

left hand corner what we actually want

405:50

to calculate whereas for the one input we

405:52

did it across in our case we did it across

405:55

the rows in this case we just want to

405:57

have in the upper left hand corner there

405:59

it is I sort of grayed it out you can

406:01

make it a little bit darker if you

406:03

want to but I would just want to make it

406:05

known that hey we're not necessarily

406:06

using it so similarly we're actually

406:08

going to get into creating it we're

406:09

going to select the entire data table go

406:11

to the data tab what if analysis data

406:13

table for the row input cell so this row

406:17

up here what are we wanting to

406:19

substitute these values into well we

406:21

want to sub it into the bonus and then

406:25

similarly for the column input same as

406:27

last time that's the annual raise so

406:29

we're going to want to sub that into C5

406:32

going to go ahead and click okay so I'm

406:34

going to dress this up a little bit I'm

406:36

going to bold the header right here also

406:38

I'm going to merge and center this all

406:40

so we can put inside of here bonus and

406:43

then finally I'm going to conditionally

406:45

format it like we did last time using

406:47

that white to green and then changing

406:50

that green to a blue to get it more of

406:53

what I want so bam now we have a two

406:56

input table and we can see what it's

406:59

going to be across all these things also

407:01

with this if you remember from our last

407:03

lesson right we were looking at finding

407:05

what is the value we'd want to be to get

407:07

around 640,000
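
A two-input data table is effectively a grid of the same formula evaluated for every row and column pair, which can be sketched in Python. Assumptions to flag: the salary model is again a guess (pay in year k = base x (1 + bonus) x (1 + raise)^k over years 0 through 4, 100,000 base), the bonus columns are assumed to run from 0% to 25% in 5% steps since the transcript only gives the first two values, and total_salary is a made-up name.

```python
def total_salary(bonus, raise_rate, base=100_000, years=5):
    """Assumed model: total pay over years 0..4, bonus paid on each year's salary."""
    return sum(base * (1 + bonus) * (1 + raise_rate) ** k for k in range(years))


bonus_rates = [step * 0.05 for step in range(6)]    # row input: 0% .. 25% (assumed)
raise_rates = [step * 0.005 for step in range(9)]   # column input: 0% .. 4%

# Each cell substitutes one bonus and one raise into the same total formula.
grid = [[total_salary(b, r) for b in bonus_rates] for r in raise_rates]

print(f"{'raise':>6}" + "".join(f"{b:>10.0%}" for b in bonus_rates))
for r, row in zip(raise_rates, grid):
    print(f"{r:>6.1%}" + "".join(f"{value:>10,.0f}" for value in row))

# Which combinations reach the 640,000 target?
reachable = [
    (b, r)
    for r, row in zip(raise_rates, grid)
    for b, value in zip(bonus_rates, row)
    if value >= 640_000
]
print(reachable)
```

Scanning the grid this way makes it easy to see which bonus and raise combinations reach the target without running a separate search for each one.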

407:10

now we have a few different values we

407:13

can actually look at for this and we can

407:16

tell from this well we're going to need to

407:18

be above a bonus rate of 15% to even be

407:20

considered to get up to 640,000 so

407:24

sometimes I like this visually better

407:26

than going in and doing something like

407:28

goal Seeker or even things like solver

407:30

because now I have multiple different

407:33

variables I can look at and analyze and

407:35

try to adjust on my own all right so you

407:38

now have some practice problems to go

407:39

through and get familiar with data

407:41

tables I found when I first started with

407:44

data tables I got really confused on the

407:46

row input and also the column input

407:49

cells but really understanding how those

407:51

are being applied into the original

407:53

formula helps you figure that out all

407:55

right with that I'll see you in the next one

407:57

where we're going into the

407:58

Analysis ToolPak and diving into a lot

408:01

of different statistical analysis you

408:02

can do with Excel so with that see you there

408:08

all right this is the last lesson

408:10

in this chapter on advanced data analysis

408:13

and specifically we're going to be

408:14

focusing on that Analysis ToolPak

408:16

add-in now this add-in is packed full of

408:19

features and I can make a whole tutorial

408:21

just on this add-in alone but we're only

408:23

going to be focusing on four core things

408:25

of it that it does that I use from time

408:27

to time on our job posting salary data

408:30

set of over 30,000 rows first we're

408:33

going to look at how we can get

408:34

descriptive statistics of something like

408:36

a salary column so we don't have to go

408:38

through and use formulas to get all the

408:39

different statistics for it second we're

408:41

going to investigate how to make

408:43

histograms but these are a little bit

408:45

with a Twist in that I feel like they're

408:47

more customizable than the previous

408:49

histograms we can make third we'll get

408:51

into ranking and assigning a percentile

408:55

for our salary data so we can understand

408:57

where it actually ranks for percentiles

409:00

and then finally we're going to be

409:01

moving into looking at a moving

409:03

average if you remember our job posting

409:05

data set had all this seasonality in it

409:07

basically went up and down a lot

409:08

depending on when it was posted during

409:10

the week well we can remove those

409:12

fluctuations by a moving average for

409:15

this we're going to be working in the

409:16

Analysis ToolPak workbook and all the

409:19

answers are in there so you can feel free to

409:21

go ahead and actually select all the

409:24

different sheets in here and go ahead

409:26

and hide them so we only have the data

409:28

tab in there and we'll be working with

409:30

this

409:34

so as a refresher this is the data

409:36

Analysis ToolPak you should have gone

409:38

through in that first lesson and

409:40

actually enabled it by going into

409:43

options into the add-ins itself and it

409:45

should now be under the active add-ins if

409:48

you didn't do that remember all you

409:50

have to do is just go into here

409:52

and select it all right so let's open

409:54

this bad boy up and if I click that

409:56

data analysis it's going to pop up here and

410:00

this dialogue box allows us to select

410:03

like I said from a variety of different

410:05

tests that we can actually perform

410:07

there's a lot of different statistical

410:09

tests in here such as regression and

410:11

sampling and then even things like

410:14

correlation covariance and whatnot so

410:16

let's start with the one that I find

410:17

myself using the most and that's

410:19

descriptive statistics when I want to

410:20

perform EDA or exploratory data analysis this

410:22

is the first thing I want to do now the

410:24

thing about this is we need to provide a

410:26

column that has numerical values in it

410:29

so we could do the date column but what

410:31

we're going to do is we're going to

410:32

provide the salary year average column

410:36

go ahead and press enter for this we do

410:38

have labels in the first row so I need

410:40

to click this here for output options we

410:42

want to go to a new worksheet so that's

410:45

what we'll leave for this and with this

410:48

we do want the summary statistics you

410:51

can go in and also specify things like

410:53

confidence level and the kth largest

410:56

and kth smallest but we're going to leave

410:57

those default for the time being and

410:59

click okay now it's popped up in this

411:01

new sheet called Sheet1 and diving

411:04

into it I'm actually going to expand

411:06

this out and then format all these

411:08

numbers real quick so that it's much more

411:10

readable so now we have all the key

411:12

statistics from it we don't have to go

411:14

through and calculate a formula for mean

411:16

median mode standard deviation the

411:19

minimum maximum sum and whatnot
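
For comparison, a similar summary can be produced with Python's standard statistics module. This is only a sketch: the salary values below are invented stand-ins for the salary year average column, and the set of measures is chosen to mirror the ones listed above rather than to reproduce the tool's full output.

```python
import statistics

# Made-up salary values standing in for the salary column.
salaries = [65_000, 90_000, 90_000, 105_000, 120_000, 147_500, 185_000]

summary = {
    "Mean": statistics.mean(salaries),
    "Median": statistics.median(salaries),
    "Mode": statistics.mode(salaries),
    "Standard Deviation": statistics.stdev(salaries),  # sample standard deviation (n - 1)
    "Sample Variance": statistics.variance(salaries),
    "Range": max(salaries) - min(salaries),
    "Minimum": min(salaries),
    "Maximum": max(salaries),
    "Sum": sum(salaries),
    "Count": len(salaries),
}

for name, value in summary.items():
    print(f"{name:<20}{value:>18,.2f}")
```

Each entry would otherwise need its own worksheet formula, which is the convenience the descriptive statistics option is providing.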

411:23

all right next up is histogram

411:26

and previously remember we could just

411:28

select something like the M column go

411:29

into insert here and actually insert a

411:32

histogram now the one problem I have

411:35

with this is the formatting of the rows

411:38

or the X values down here it basically

411:41

provides this range this is a lot of

411:43

data right there and it's really

411:44

hard to format this so let's look at an

411:47

alternate option for this using the data

411:49

Analysis ToolPak specifically we're going to

411:52

come in here to histogram for the input

411:54

range once again I'm going to go ahead

411:56

and just select that column M press

411:58

enter it does have labels for bin range

412:01

I'm going to leave empty I'm not going to

412:03

specify a width of the histogram or the

412:05

bin I'm going to leave it just default

412:08

for the output I'm going to leave it as

412:09

the new worksheet ply I don't want

412:11

either of these the Pareto or the

412:13

cumulative percentage instead I just

412:15

want the chart output of this press okay

412:18

and here we have the histogram it's

412:20

honestly not too special it's a little

412:22

hard to read based on the size of these

412:25

bins as you can see basically the

412:27

difference between these is around it

412:30

looks like they're doing basically a

412:31

thousand increments so the increments

412:33

are way too small we need to adjust the

412:35

bin anyway the one good thing is along

412:38

this x-axis it's only one value now so a

412:41

lot easier to read so now let's go in

412:44

and adjust that bin size so if I go back

412:47

to data analysis into histogram and

412:49

click okay for the bin range it wants me

412:51

to actually put in a range or a

412:54

selection so we need to actually

412:56

pre-fill out what range or bins we want

412:59

for this so I'm going to copy this

413:00

header up here cuz we're going to keep

413:01

the bin and frequency start a new sheet

413:05

paste it in here and I want to go in

413:08

we'll say 50,000 increments so 0

413:11

50,000 and I want it to go to basically

413:16

400,000 so now going into Data analysis

413:19

again histogram opening it back up still

413:21

has the input range selected correctly

413:23

now for the bin range I'll select A2 to

413:27

A10 select the output range I

413:30

basically want it to be inside of

413:32

this sheet so I'm going to select up

413:34

here on D1 we'll just start there and we

413:37

want a chart output on this page okay

413:39

I'll click okay and I'm getting this

413:41

error message that the input range must

413:42

contain at least one data point right

413:44

now this M column is not referring back to

413:47

the correct sheet it needs to look at so

413:49

actually I'm going to select right here

413:51

you can see it selected that other sheet

413:52

I actually want to select the M column

413:54

of the data tab now we'll press okay so

413:58

now I love this because once it output

414:01

this I didn't apparently need to do this

414:03

frequency thing I got confused anyway we

414:06

can actually go in and format this to

414:09

remove the legend and then update the

414:12

axis title to salary and then we'll

414:15

update this one for frequency anyway I

414:17

really like this because now look at

414:19

this control we were able to minimize it

414:23

not to go past 400,000 and have all these

414:26

outliers and everything else that is

414:28

past 400,000 is put into this basically

414:31

more value anyway this is my

414:33

preferred method for making histograms

414:35

especially whenever I need to control

414:37

that x-axis
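
The binning logic can be sketched in Python to show how the custom bins and the overflow bucket work. Assumptions: the salaries are invented, the function name histogram is made up, and the counting rule (a value lands in the first bin whose edge is greater than or equal to it, with everything above the last edge going to "More") reflects my understanding of the tool rather than anything stated in the transcript.

```python
from bisect import bisect_left


def histogram(values, bins):
    """Count values per bin edge; values above the last edge go to 'More'."""
    counts = [0] * (len(bins) + 1)  # last slot is the overflow bucket
    for value in values:
        # Index of the first bin edge that is >= value.
        counts[bisect_left(bins, value)] += 1
    labels = [f"{edge:,}" for edge in bins] + ["More"]
    return list(zip(labels, counts))


bins = list(range(0, 400_001, 50_000))  # 0, 50,000, ..., 400,000
salaries = [45_000, 50_000, 72_500, 98_000, 110_000,
            150_000, 151_000, 240_000, 415_000, 890_000]  # made-up values

result = histogram(salaries, bins)
for label, count in result:
    print(f"{label:>8} {count}")
```

The two made-up outliers above 400,000 both end up in the single "More" row, which is what keeps the axis compact.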

414:40

next up is Rank and Percentile and

414:43

with this one we're going to be doing a

414:44

rank and percentile of that salary year

414:46

average column once again now this one

414:49

depending on the size of your computer

414:51

may take a while it may even crash your

414:53

computer so if you're concerned that

414:56

this is not going to be able to

414:58

be performed on your computer don't run it

415:00

just look at my example and understand

415:02

what you get out of it anyway I selected

415:03

Rank and Percentile and then for the

415:05

input range once again I'll select that

415:08

column M and then we'll output it to a

415:10

new worksheet ply and I can do something

415:12

like even name it in this case calling

415:15

it Rank and percentile of the sheet that

415:16

it's going to go to so clicking okay it

415:19

says Rank and percentile input range

415:20

contains non-numeric data basically I

415:24

forgot to click this of labels in the

415:25

first row clicking again it's thinking

415:28

how long is it going to take all right

415:31

so Excel just crashed on me maybe that wasn't a

415:33

great idea let's try that again using

415:35

Rank and percentile this time instead of

415:38

selecting the whole column I think

415:40

because it had some blank values

415:41

especially down to a million rows sort

415:43

of crashed it instead what I'm going to

415:45

do is I'm going to just select A1 and

415:47

then select down all the way to the

415:49

bottom I don't know why it changed it

415:51

over to column F but the main point of

415:52

me doing this is that way we select

415:55

column M and also I need to remove this

415:59

A1 at the beginning okay and also need

416:01

to update this to be starting the second

416:03

cell and we're going to try this again I

416:05

gave it the name of rank percentile I

416:07

didn't have the labels in first row

416:08

selected because we're going from the

416:09

second cell how long is it going to take

416:11

this time all right so that was a lot

416:14

quicker this time and we have our now in

416:16

this Rank and percentile sheet our

416:18

actual data it did take about a minute

416:21

to do so once again if you have a

416:22

computer that's not necessarily that

416:24

fast don't try this at home all right so

416:27

some key statistics about this it

416:29

provides a point which is the row number

416:32

itself and then from there what is

416:34

the value that's the column the rank and

416:37

then the percentile what's cool about

416:39

this is because it provided the point we could

416:41

do something like the index function and

416:44

you provide it an array and then the row

416:46

number in this case that's the row

416:47

number so if I wanted to find out what

416:50

the job title is I could select column B

416:54

and then from there for the row number

416:56

go back to rank and percentile and

416:58

select this value right here then close

417:01

parenthesis press enter looks like it's

417:03

a clinical NLP data scientist and I can

417:06

actually autofill this all the way down

417:09

anyway let's make sure this is actually

417:11

correct okay yeah just double checking

417:13

the row number at 25589 is clinical NLP

417:17

data scientist so we have it correct
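
The same idea (rank each salary, attach a percentile, then use the point to look up another column) can be sketched in Python. Assumptions: the job titles and salaries are invented, rank_and_percentile is a made-up helper, ties share a rank, and the percentile is computed as the share of the other values that are smaller, which is my understanding of how the tool reports it. The nested loops are fine for a handful of values but would be slow on tens of thousands of rows.

```python
def rank_and_percentile(values):
    """Return rows of (point, value, rank, percentile), largest value first.

    point is the 1-based position in the input list.
    """
    n = len(values)
    rows = []
    for point, value in enumerate(values, start=1):
        larger = sum(1 for other in values if other > value)
        smaller = sum(1 for other in values if other < value)
        percentile = smaller / (n - 1) if n > 1 else 1.0
        rows.append((point, value, larger + 1, percentile))
    return sorted(rows, key=lambda row: row[2])


# Made-up data standing in for the job title and salary columns.
job_titles = ["Data Analyst", "Data Engineer", "Data Scientist", "Business Analyst"]
salaries = [90_000, 130_000, 155_000, 85_000]

rows = rank_and_percentile(salaries)
for point, value, rank, percentile in rows:
    title = job_titles[point - 1]  # same role as the index function lookup
    print(f"{rank:>4} {percentile:>7.1%} {value:>9,} {title}")
```

The lookup by point plays the part of the index function: any other column in the same row order can be pulled in the same way.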

417:19

anyway I could go through now and I did

417:21

this for the job title itself but you

417:23

could imagine you could pull out things

417:25

like the job country job title short all

417:27

sorts of other key information and get

417:29

this in a list of what its rank is

417:32

along with its percentile

417:36

our last feature to look at

417:38

is moving average and this is what we're

417:41

going to be calculating here the Blue

417:43

Line already is data we already have of

417:46

what are the job postings over time but

417:48

that orange line is the moving average

417:51

we can use this Analysis ToolPak in

417:53

order to calculate this and as you can

417:55

see it removes a lot of these

417:57

weekly fluctuations if you will

418:00

from it and makes it a lot more

418:02

basically readable to see where the actual

418:04

peaks and the troughs are now in

418:06

order to do this I can't necessarily

418:08

just put in that job posted date into it

418:11

I have to actually get a count of the

418:15

dates and also what are the counts of

418:17

the job postings per date so we need to

418:19

create a pivot table so we go in insert

418:21

pivot table from table we're going to do

418:23

it from this table which is named jobs

418:25

and we're going to insert it into a new

418:27

sheet similar to before we're going to put

418:29

that job posted date into the rows and

418:32

I'm actually going to take out you can

418:33

see it aggregated by month I'm going to

418:35

take out the month from there so it does

418:38

by days and now I'm going to throw into

418:41

the values here it's going to do a count

418:43

so I'll just change this to job count and

418:48

we can actually visualize this by itself

418:50

by going to insert pivot charts

418:51

inserting in a pivot chart we want a

418:55

line and that's what we saw before with

418:58

our Blue Line before that showed how it

419:00

basically went across uh went through

419:01

time

419:02

all right so I'm going to go ahead and

419:04

delete this chart because we're going to

419:05

be making it and once again we're going

419:07

to that data tab into Data analysis and

419:10

we're going to be performing a moving average

419:13

for the input range I'm going to select

419:15

B4 and then select all the way to the

419:18

Bottom now this grand total went into it

419:21

so actually I'm going to back up one and

419:23

change this to 368 we didn't select any

419:26

labels in the first row so I'm going to

419:28

leave that blank for the interval I'm

419:30

going to just set it to something like

419:32

seven for the time being for the output

419:35

range I want it to go right next to my

419:36

chart so I'm going to copy this above

419:39

and paste it below and change these B's

419:41

into C's so it's C values right next to

419:45

it and we want a chart output along with

419:48

standard errors I'm going to go ahead

419:49

and click okay now this chart is not

419:52

correct um we made a little bit of a

419:54

mistake but I did want to show you real

419:56

quick this moving average we can see

419:58

that it starts 7 days later right here

420:02

and so that's what's happening in this C

420:04

column here that's the actual moving

420:06

average and then the actual error itself

420:08

is right next to it it's pretty

420:10

consistent around 30 to 40 anyway we

420:12

need to fix this we need to take this

420:13

entire value if you will and move it out

420:16

of a pivot chart so I'm going to select

420:18

this all the way down to the bottom and

420:21

copy it then inside of a new sheet I'm

420:23

going to come in and paste it I'm going

420:25

to just paste looks like it's pasting with

420:27

the pivot table formatting I'm going to

420:28

paste uh the values only and change this

420:32

to job date so let's try this again

420:34

using data analysis going to moving

420:36

average for the input range we're going

420:39

to select B2 and then all the way down

420:41

to the bottom remember this has a grand

420:43

total so I actually need to change that

420:45

to minus one for the interval I'm going

420:48

to adjust it a little bit I'm going to

420:49

actually change this now to a 21-day

420:52

moving average and then for the output

420:53

range this actually needs to be adjusted

420:55

to match what the input range is but for

420:59

B or C sorry anyway go ahead leave

421:02

everything else checked click okay and

421:05

Bam now we have blue and also orange if

421:09

you will for the actual and the forecast

421:13

now one thing I'm noticing with this

421:14

chart is well the markers are pretty

421:18

heinous they're clogging

421:20

up this chart so what I can do is Select

421:22

something like this orange line right

421:23

here I can right-click it go to format

421:26

data series and then here underneath

421:28

this fill and line go into markers and

421:32

then for the marker options just

421:34

basically do none we just want to have a

421:36

line instead additionally we can just go

421:38

ahead and click the blue line

421:41

and for the markers there we can do none

421:43

as well okay sensory overload's gone now

421:46

looks a lot more readable with the

421:49

exception of down here for some reason

421:51

it didn't pick up the dates on mine and

421:54

we can adjust that by right clicking

421:57

that and going to select data underneath

422:00

the horizontal axis labels I'm going to go

422:02

ahead and edit this I'm going to select from

422:05

A2 all the way down minus one we don't

422:08

want to do grand total click okay that

422:11

changed the names let's see if that

422:12

updated the chart and Bam it did now I'm

422:15

going to do some minor cleanup I'm going

422:17

to remove that Legend from there and

422:21

that looks a lot better so now we have a

422:23

graph of our moving average of the job

422:26

postings and as we sort of suspected in

422:30

August we had a peak along with January

422:32

seemed sort of high then went down a

422:34

little bit but then up again in August

422:36

so we see a lot more Trends and then

422:37

tapering out towards the end of the year
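
The moving average itself is simple enough to sketch in Python. Assumptions: the daily counts are invented (a repeating weekly pattern with a weekend dip), moving_average is a made-up helper, and the trailing-window definition with no output until a full interval is available is based on the chart starting 7 days late as described above.

```python
def moving_average(values, interval):
    """Trailing average over the last `interval` values.

    Positions without a full window yet are returned as None.
    """
    averages = []
    window_sum = 0
    for i, value in enumerate(values):
        window_sum += value
        if i >= interval:
            window_sum -= values[i - interval]  # drop the value leaving the window
        averages.append(window_sum / interval if i >= interval - 1 else None)
    return averages


# Made-up daily job counts: three identical weeks with a weekend dip.
daily_counts = [120, 135, 128, 140, 110, 40, 35] * 3
smoothed = moving_average(daily_counts, 7)

for day, (count, average) in enumerate(zip(daily_counts, smoothed), start=1):
    shown = "n/a" if average is None else f"{average:.1f}"
    print(f"day {day:>2}: count {count:>3}  7-day average {shown}")
```

Because the made-up pattern repeats every 7 days, the 7-day average is flat once it starts, which is the smoothing of weekly fluctuations in its most extreme form; a longer interval such as 21 would start later and smooth more.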

422:40

all right now it's your turn to go

422:41

through and practice with those practice

422:43

problems and exploring some of these

422:45

features in the Analysis ToolPak add-in

422:47

with that we're going to be wrapping

422:50

up this chapter and in the next one

422:52

we'll be jumping into Power query which

422:55

I'm super excited about in order to

422:56

clean up our data and load it in in the

422:59

format that we want easily all right

423:01

with that see you

423:06

there welcome to this chapter on power

423:08

query and no pun intended but this is

423:11

one of the most powerful tools within

423:14

Excel it allows us to perform ETL

423:17

processes or extract transform and load

423:21

which is just some fancy data engineering

423:23

talk for connecting to a data source and

423:25

loading it in after you clean it up
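
To make the three ETL stages concrete, here is a minimal Python sketch of extract, transform and load as three small functions. This is only an illustration of the idea, not how power query works internally. The sample data, the column names, and the list standing in for a worksheet table are all assumptions.

```python
import csv
import io

# Hypothetical raw data with the usual mess: stray spaces, blanks, odd casing.
RAW_CSV = """job_title,salary
 Data Analyst ,90000
Data Engineer,
data scientist,120000
"""


def extract(text):
    """Extract: read raw rows from a CSV source (an in-memory string here)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and title-case titles, convert salary to int or None."""
    cleaned = []
    for row in rows:
        salary = row["salary"].strip()
        cleaned.append({
            "job_title": row["job_title"].strip().title(),
            "salary": int(salary) if salary else None,
        })
    return cleaned


def load(rows, destination):
    """Load: append cleaned rows to the destination (a list standing in for a table)."""
    destination.extend(rows)
    return len(rows)


table = []
loaded = load(transform(extract(RAW_CSV)), table)
print(loaded, table[0])
# prints 3 {'job_title': 'Data Analyst', 'salary': 90000}
```

Because each stage is a function, rerunning the pipeline on new data repeats exactly the same cleanup.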

423:27

anyway in this chapter we have five

423:30

lessons specifically in this one we're

423:31

going to have an intro to power query

423:33

what it's all about how to actually

423:35

connect to a data source in the next one

423:37

we'll be moving into the power query

423:39

editor and we'll be covering that for

423:42

three lessons in order to go in how to

423:44

actually clean up your data and get it

423:46

prepared to a format that you want in

423:49

the last lesson we'll be diving into the

423:51

M language which is powering power query

423:55

don't worry we're not going to do any

423:56

in-depth coding or anything like that

423:58

just want you to have some familiarity

423:59

with it so you have more experience with

424:01

using power

424:05

query so what's this lesson about well

424:08

in order to understand that we have to

424:10

understand what power query is and

424:13

here on Microsoft's learning platform

424:16

they have this fancy Dancy diagram that

424:17

basically shows what power query

424:20

does it allows us to connect to

424:22

different data sources it could be

424:23

something like a database a text file or

424:26

even something on the cloud from there

424:29

power query will then pipe it in to a

424:32

bunch of different products they have

424:33

and we're going to be using it for

424:34

Microsoft Excel but it's also famously

424:37

in Power BI now if you have a

424:39

Windows version of excel power query is

424:41

going to work just fine on the Mac

424:43

versions it is available however it's

424:47

very limited so a lot of the stuff we're

424:49

going to do within this lesson you're

424:51

not going to be able to do and also

424:53

Microsoft online is just completely not

424:55

available so as a reminder power query

424:57

is an ETL tool or extract transform load

425:01

and we can connect to a data source

425:03

such as this here's a Wikipedia page on

425:05

the list of S&P 500 companies and it has

425:09

all the different 500 companies that are

425:11

part of the S&P 500 anyway let's say I

425:14

want this table I could go through and

425:17

try I mean as you can see I'm trying to

425:18

select it right now and it's like

425:19

selecting the whole page it's a whole

425:22

mess if I'm trying to get this but we

425:24

can actually use power query to extract

425:26

all these components out all I have to do

425:29

is go in and provide the web address of

425:32

this which I know it's located right

425:34

here I'll then select which of the

425:36

tables I want out of the web page which

425:38

is this one right here and then I just

425:40

load it in and here it is in our

425:43

workbook now don't worry I sort of ran

425:45

through that example real quick we're

425:46

going to go more in depth and Detail in

425:48

the last example in this lesson but I

425:51

just wanted to show the power of this

425:53

and how we can actually get data even

425:55

from online into our workbook so easily
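
For a feel of what pulling a table out of a web page involves, here is a rough Python sketch using only the standard library. It is an assumption-laden illustration, not power query's actual mechanism. The tiny HTML page and the company names are invented, nested tables are not handled, and a real page would first have to be downloaded.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> on a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # all tables found so far
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Invented page: one "junk" navigation table and one data table.
PAGE = """
<html><body>
<table><tr><td>Main menu</td></tr></table>
<table>
  <tr><th>Symbol</th><th>Security</th></tr>
  <tr><td>AAA</td><td>Example Corp</td></tr>
  <tr><td>BBB</td><td>Sample Inc</td></tr>
</table>
</body></html>
"""

parser = TableExtractor()
parser.feed(PAGE)
companies = parser.tables[1]  # pick the table we want, like in the Navigator
print(len(parser.tables), companies[1])
# prints 2 ['AAA', 'Example Corp']
```

Choosing `parser.tables[1]` mirrors the step of selecting which table on the page to load.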

425:58

so why do we need to use power query

426:00

well we're going to find that out as we

426:02

go along but I'm going to give you the

426:04

tidbits right now of One it automates

426:07

the ETL process so I don't have to do

426:09

that annoying task of going to a sheet

426:12

and copying it over every time I get new

426:14

data I can just get power query to do it

426:16

for me additionally with that sometimes

426:18

I may make mistakes when I copy and paste

426:20

sheets over therefore I have

426:23

reproducibility and then finally with

426:25

this I'm now allowed to bring data in

426:28

that potentially exceeds that 1 million

426:30

row limit of Excel which we'll show how

426:33

we can deal with that in a bit so let's

426:35

actually get into performing our first

426:36

example of loading in a simple data set

426:39

specifically from another Excel sheet

426:41

like I talked about at the beginning of the

426:42

advanced chapters you're not going to be

426:44

able to actually work inside of the

426:46

workbooks that I have given so in this

426:47

case power query intro has the final

426:50

results but I don't want you working in

426:52

that I'll tell you what workbooks you need

426:54

to be working with as we go through this

426:56

which you're probably getting the

426:56

security warning of external data

426:58

connections have been disabled and we'll

427:00

get to troubleshooting that at the

427:02

end so instead we're going to be

427:03

starting with a new blank workbook I'm

427:06

going to navigate over here to the

427:07

data tab this is where power query is

427:10

located specifically under this get and

427:13

transform data it doesn't really say

427:15

power query but that's where power query

427:17

is hidden now anytime I'm importing any

427:19

data I typically go to this get data and

427:22

then from there I navigate Down Deeper

427:25

depending on if it's from file from database from

427:27

fabric and power platforms or from even

427:30

other sources they do have for all these

427:32

for it here they also have smaller icons

427:35

right next to it that you can navigate

427:37

over and basically highlight okay this

427:39

is from web and then this is from a

427:41

table or range and whatnot we're going

427:43

to be going over multiple examples in

427:45

this video so don't worry if you're not

427:47

following along with which data sources

427:48

you can actually import I think you'll

427:50

have a good idea by the end of

427:54

this so what are we going to import

427:56

first well if you navigate into our

427:59

course folder under resources under data

428:01

sets and then data jobs monthly we

428:04

have Excel files for every single month

428:08

we're going to start by just importing

428:10

one Excel file to start and then in the

428:12

next exercise we'll go into how to

428:14

import all these at once anyway we're

428:15

going to start simple first with just

428:17

this Excel file so for this I'm going to

428:19

go to get data and it's a file

428:22

specifically it's from an Excel workbook

428:25

inside the course folder I'm going to

428:27

then navigate to the data set going to

428:29

resources data sets monthly and then

428:31

select that January data set and click

428:33

import with power query you're going to

428:35

find that it has this Navigator window

428:37

pop up and from there it will show you

428:40

what it is actually importing in this

428:42

case January data jobs the Excel sheet

428:45

and then if it had one or multiple

428:47

sheets it will appear there underneath

428:49

it whenever I select sheet one it then

428:51

shows me to the right hand side a

428:53

snapshot or a preview of all the

428:56

different data in there it doesn't show

428:58

all the columns but a snapshot of it at

429:01

the bottom there's a few options to load

429:04

or load to and then also transform we're

429:08

going to keep it simple for the time

429:10

being and we're just going to load so

429:13

I'll go ahead and click it so we just

429:15

imported in this data set from another

429:17

worksheet it's already in its own

429:20

table and because it also was sheet one

429:24

it's naming the sheet sheet one

429:26

parenthesis 2 to signify as the second

429:29

one so congratulations we just completed

429:31

our first ETL process of actually

429:34

extracting transforming and loading an

429:36

Excel workbook into another

429:41

workbook so we loaded this table in but

429:44

how do we actually go about using it

429:47

well in this portion we're going to be

429:48

demonstrating how we can manipulate it

429:49

with a pivot table and how to basically

429:52

control all our different queries if you

429:54

notice we had over on the right hand

429:56

pane this queries and connections now if

429:59

it's not popping up you can go up here

430:01

to the data Tab and then you see queries

430:04

and connections you can navigate it on

430:06

and off by clicking this button power

430:09

query sets up these queries and in this

430:11

case it named it sheet one after the

430:13

sheet one in that workbook that we

430:15

exported in I'm sorry that we imported

430:17

in and if we hover over it we can get

430:19

some details about the columns when it

430:22

was last refreshed its load status and

430:24

even data source now connections over

430:28

here on the right right now we have zero

430:29

connections that's actually what's

430:31

controlled by power pivot which we're

430:33

going to be going over in the next

430:35

chapter on power pivot but anyway back

430:38

to Power queries itself right now we see

430:40

with sheet one that 3,000 rows are

430:43

loaded and if necessary we can go through

430:46

and refresh the data set as showing it

430:48

loaded the data and 3,000 rows are

430:50

loaded again pretty quick so let's

430:52

actually get into manipulating this well

430:54

it says that 3,000 rows are loaded but I

430:56

actually I can go in and delete this tab

430:59

and it's going to give you this warning

431:00

that it's going to permanently delete the sheet do

431:02

you want to continue yes I do and

431:04

whenever I do that since the data is no

431:06

longer loaded it now displays that it's

431:09

connection only so we can actually

431:11

change where we load our data to if you

431:15

will and I can get to this by right

431:18

clicking it and then going into here and

431:20

we'll be exploring all these other

431:22

options as we go through but I only

431:24

want to focus right now on this load to

431:27

and they have a few different options in

431:28

here let's actually explore them right

431:30

now it has only create connection so

431:32

right now it only has a connection if we

431:34

go back to that table and click okay it

431:37

once again loads it into that table if

431:40

we want to actually get into a pivot

431:42

table we'll select this one pivot table

431:44

report we can also do a pivot chart and

431:46

it asks whether we want to put it in the

431:48

existing worksheet or a new worksheet

431:50

and then finally it has ADD this data to

431:53

the data model you've seen this one

431:55

before and once again we're going to be

431:58

going over data models more in depth in

432:00

chapter eight on power pivot so we're

432:02

not going to be enabling this checkbox

432:05

just yet anyway I want it in the existing

432:07

worksheet I don't need that table there

432:09

so I'm going to click okay and it says

432:10

hey there's possible data loss because

432:12

we're going to be basically getting rid

432:13

of that table and replacing it with a

432:15

pivot table do I want to continue yeah

432:18

and now like we did before in the pivot

432:19

table chapter we're now using a pivot

432:22

table and so we can put things like job

432:24

title short and analyze it for the count

432:26

of different jobs that it has within it

432:28

there's no change whatsoever in

432:29

everything we learn in pivot tables

432:31

still the same application that we're using

432:33

it for here

432:36

so now let's actually get into

432:39

importing multiple different Excel files

432:41

we're going to specifically be importing

432:43

all 12 of these of January through

432:46

December this time whenever I go into

432:48

the data tab under get data and we want

432:50

to get it from a file but I'm not going

432:53

to select an Excel workbook instead what

432:55

I'm going to do is select a folder

432:57

because all those Excel files are in the

433:00

same folder inside my course I'll

433:02

navigate into resources data sets and

433:05

then I'm going to select the folder

433:06

itself and select open now you may

433:08

notice the Navigator window looks a

433:10

little bit different and that's because

433:12

now it contains the metadata of these

433:15

Excel files themselves such as the name date

433:18

accessed modified created and whatnot and

433:20

with this one before we had that load

433:22

and load to along with transform data

433:25

we're just going to go into combining

433:27

this data set so I'm going to go ahead

433:28

and click that and specifically we're

433:30

going to use combine and load now we

433:33

navigate to a window we're more familiar

433:35

with of combined files and what this is

433:38

doing is showing us how it's going to

433:40

actually combine the files in that we

433:43

need to make sure one that they're all

433:44

the same format but if I actually click

433:47

sheet one of which the sample file is

433:50

looking at is the first file this is

433:52

what it looks like and we know this

433:54

already because we looked at the January

433:55

file anyway if you wanted to you could

433:57

also change this to a specific file I'm

434:00

fine with just using first file

434:01

selecting a sheet if I was having errors

434:04

I would do skip files with errors but

434:05

I'm not worried about that just yet I'm

434:07

going to go ahead and click okay and Bam

434:09

now we have that once again that table

434:11

loaded into here and this has all the

434:13

data so I expect it to have around

434:15

30,000 results similar to what we've

434:18

been working with before and it looks

434:20

like it does and if you notice we have

434:22

this new column right here on Source

434:24

name which tells which Excel file each

434:27

of these comes through and just doing a

434:29

cursory check it looks like all the

434:30

different months are in there now onto

434:33

this queries and connections pane up

434:35

here I'm going to actually make this

434:36

smaller so I can actually see it all

434:38

previously we only had our sheet one

434:40

query but now we have also this data

434:43

jobs monthly query and with that up here

434:47

at the top because we're connecting

434:50

multiple different files we have these

434:53

helper queries that were created during

434:55

the process so you can navigate over

434:57

these and basically see that hey it used

435:00

the September file as a sample and this

435:02

is the steps it took or this is what the

435:04

sample file actually looks like anyway

435:07

I'm not too concerned with those helper

435:08

queries right there or with anything

435:10

underneath this transform from files I

435:12

mainly care about what's under those

435:14

other queries so we have sheet one and

435:16

data jobs monthly speaking of which

435:18

sheet one is a really bad name for this

435:20

so I'm going to rename this to data jobs

435:23

January I also rename the sheet so just

435:25

to prove with the data jobs monthly that

435:27

we actually imported it all in we're

435:30

going to go in and load to and we're

435:32

going to do this time a pivot chart

435:35

going to go ahead and click okay we're

435:37

doing the existing sheet with the table

435:39

I don't care if I get rid of that table

435:40

so I'll click okay and similar to before I'm

435:42

going to put that job title short this

435:44

we're going to put in the Axis or the

435:45

rows and then we're going to want a

435:47

count of that as well and then I'll just

435:49

organize this in descending order based

435:52

on the count of job title short so bam

435:55

we now connected with power query to

435:58

multiple Excel files and imported in at

436:01

once I hope you realize that now this

436:05

unlocks a lot of potentials because say

436:07

you get January of next year's data you

436:10

could just put it into this folder here

436:13

and then just all you need to do is go

436:15

back into the data tab click refresh
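
The combine-a-folder-and-refresh idea can be sketched in Python like this. It is a simplified illustration under stated assumptions, not what power query does internally. It reads CSV files instead of Excel workbooks so it can run with the standard library alone, and the file names, the `source_name` column, and the single `job_title_short` column are hypothetical stand-ins.

```python
import csv
import tempfile
from pathlib import Path


def combine_folder(folder: Path):
    """Append the rows of every CSV in `folder`, tagging each row with its file name."""
    combined = []
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                combined.append({"source_name": path.name, **row})
    return combined


def write_month(folder: Path, name: str, titles):
    """Demo helper: write a tiny one-column CSV file."""
    with (folder / name).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["job_title_short"])
        writer.writerows([title] for title in titles)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    write_month(folder, "01_jan.csv", ["Data Analyst", "Data Engineer"])
    write_month(folder, "02_feb.csv", ["Data Scientist"])
    first_load = combine_folder(folder)

    # A new file lands in the folder; rerunning the same query picks it up.
    write_month(folder, "03_mar.csv", ["Data Analyst"])
    refreshed = combine_folder(folder)

print(len(first_load), len(refreshed))
# prints 3 4
```

The second call to `combine_folder` plays the role of a refresh: nothing about the query changes, only the contents of the folder.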

436:18

it's going to go through and refresh all

436:20

that data set and pull those new numbers

436:25

in all right in this example you're not

436:28

going to follow along I just want to

436:29

show the power of power query okay the

436:32

pun's getting old by now anyway I have

436:34

this CSV file or comma separated values

436:38

basically it has commas separating

436:40

everything I'm looking at this in VS

436:42

Code don't worry about any of this stuff

436:43

like I said you're not doing it the main

436:45

point is to show this data set itself

436:47

right here it's starting at the top row

436:48

of one and if I scroll all the way down

436:51

we get to the last entry and that's

436:55

2.7 million jobs that I have here in

436:59

this data set we can actually import

437:02

this into Excel now if you recall if you

437:04

scroll all the way down to the bottom of

437:06

excel it only includes about 1 million

437:09

rows so how the heck are we going to do

437:11

this with power query so this is a CSV

437:13

file I'm going to go to data get data

437:15

from file specifically it's a text or

437:17

CSV and I'm going to import in this data

437:19

jobs large file that I have reminder

437:21

again you don't have access to this file

437:23

it's just too big to even get onto

437:24

GitHub so that's why this is a demo only
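
To see why a file this size is a problem for a worksheet but not for analysis, here is a small Python sketch. It assumes the documented worksheet limit of 1,048,576 rows. The column names and the three-row sample are invented, and the streaming count is only an analogy for summarizing data without placing every row on a sheet.

```python
import csv
import io
from collections import Counter

EXCEL_MAX_ROWS = 1_048_576  # rows on a single worksheet, header row included


def fits_on_worksheet(data_rows: int, header_rows: int = 1) -> bool:
    """True when the data plus header rows fit on one worksheet."""
    return data_rows + header_rows <= EXCEL_MAX_ROWS


def count_by_column(handle, column: str) -> Counter:
    """Stream a CSV and count rows per value of `column`, one row in memory at a time."""
    counts = Counter()
    for row in csv.DictReader(handle):
        counts[row[column]] += 1
    return counts


# Invented sample standing in for the large file.
sample = io.StringIO(
    "job_posted_month,job_title_short\n"
    "2023-01,Data Analyst\n"
    "2023-01,Data Engineer\n"
    "2023-02,Data Analyst\n"
)
counts = count_by_column(sample, "job_posted_month")
print(dict(counts), sum(counts.values()))
# prints {'2023-01': 2, '2023-02': 1} 3
print(fits_on_worksheet(2_700_000))
# prints False
```

The summary stays tiny no matter how many rows are streamed, which is the same reason a summarized view can cover more rows than a sheet can display.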

437:27

this is the data set itself so I'm going

437:29

to go in and actually go and load

437:31

it now this is taking a little bit of

437:33

time as you can see it's loading around

437:35

100,000 rows as it goes through also it

437:38

had well it has three errors now in here

437:41

this usually appears whenever it has a

437:43

row of data that doesn't necessarily

437:45

make sense for what it's supposed to

437:46

import it alerts you there's an error so

437:48

the 2.7 million rows are loaded but I

437:50

get this error message the query

437:52

returned more data than will fit on a

437:53

worksheet remember it automatically by

437:55

default tries to load it into a table

437:58

into Excel and it's telling me that hey

438:00

it's not going to fit so I'll click okay

438:03

now it's still going to try to load that

438:04

table but it's going to cut it off at

438:07

that 1.05 million but this doesn't mean

438:10

we can't analyze it if I scroll over

438:13

this query it reminds me that the

438:14

results of this query is too large to be

438:16

loaded to the specified location

438:17

worksheets have a limit of 1 million

438:19

rows sure instead what I'm going to do

438:21

is go and load to and I'm going to load

438:23

to a pivot table click okay and it's

438:26

going to warn me again about the table

438:28

loss yeah I know so once it loaded like

438:30

a hot minute to do that I can actually

438:32

go through and now analyze these 2.7

438:35

million rows so if I do something like

438:37

put the job posted date into the rows

438:40

and we also want to get a

438:42

count of this so I'm going to put the

438:43

job posted date also into the values so

438:45

we get these counts anyway reformatting

438:47

it with commas to actually be able to

438:48

read this now we can see that we did

438:52

actually get in 2.7 million different

438:54

data points for this and as a side note

438:57

this is all the data that I've collected

438:59

since I started in 2022 doing this so

439:03

there's a lot of different jobs so Excel

439:05

is not necessarily limited to just

439:07

analyzing 1 million rows of

439:12

data all right so let's finally get into

439:14

that last example of importing in this

439:16

list of S&P 500 companies feel free you

439:19

don't necessarily have to do this table

439:20

from Wikipedia but I'll drop a link

439:23

below on where this table is located and

439:25

you can use that if you want so I copied

439:27

the web page of that table then I'm

439:29

going to come in here and select like

439:30

this of from web you can do basic or

439:33

Advanced with Wikipedia it's perfectly

439:35

fine to do the basic version putting in

439:37

that URL clicking okay we get into that

439:39

Navigator window and there's actually

439:42

multiple tables inside of here one is

439:45

the list of 500 companies and the second

439:48

one is a list of companies that have

439:50

been added and also removed from there

439:53

they also just have random tables in

439:55

there as well just because in the

439:56

internet you're going to have random

439:57

tables like this one of main menu

439:58

contents tools appearances not

439:59

applicable anyway we want to do table

440:02

one I'm going to go ahead and click load

440:04

and now that we have it in here anytime

440:06

we do this we probably need to rename it

440:08

appropriately from something like table

440:10

one to S&P 500 in this case and Bam

440:13

scrolling down we can see that we have

440:16

um should be 500 oh a little bit more

440:18

than 500 apparently the list has been

440:21

updated to clear a little bit more I

440:23

don't know why that is but got all the

440:25

data

440:28

nonetheless now quick note on if you

440:31

want to actually navigate into any of

440:33

the files and see what I've done

440:36

whenever you go to open it so in this

440:38

case I want to open power query intro

440:40

I'm going to open it up you're going to

440:42

get this of external data connections

440:44

have been disabled do you want to enable

440:46

content in this case yes you want to

440:48

enable all that now the problem you now

440:50

may also have is that it may give you a

440:53

warning that your data source settings

440:54

aren't correct and what do I mean by

440:56

that if I go into data and then under

440:58

get data we're going to see this thing

441:00

here for data source settings and it's

441:04

managing settings for your data sources

441:06

anyway you're going to see these

441:07

locations here these are file locations

441:10

of the data sets and they reference the

441:12

files that are on my computer that's not

441:16

going to be the same for your computer

441:17

it's probably going to be in a different

441:19

location with a different name so here I

441:22

know that this is the data jobs monthly

441:25

folder if I wanted to actually go in and

441:28

update it with the actual location for

441:30

where it is I would go down here select

441:33

change source and then from there select

441:36

browse to navigate to it you're going to

441:38

once again navigate to your course of

441:40

excel. analytics into resources data

441:43

sets and then there's that data jobs

441:44

monthly click open and okay and then

441:48

it's going to update you're going to

441:50

have to do that for this file and all

441:52

the files within a power query and also

441:55

power pivot because your file locations

441:58

are not the same as my file locations

442:00

then after you do all that it should go

442:02

through and refresh but if it doesn't

442:03

you can manually refresh it underneath

442:05

the data tab by clicking refresh

442:09

all the last item to call out is the

442:12

options menu we're going to be going

442:13

into the query editor in the next video

442:16

so we're going to save that for that one

442:17

anyway query options has a lot of

442:20

advanced details in controlling power

442:23

query in this case of showing the query

442:25

peek when hovering on a query in the

442:27

queries task pane that's sort of

442:29

annoying to me it pops up every now and

442:31

then I'm going to go ahead and unclick

442:32

it but they also have different

442:34

behaviors you can control for data

442:35

load for the power query editor the

442:37

security privacy and even Diagnostics so

442:40

feel free to go through this and

442:41

navigate and see what is available to

442:43

actually customize with this I'm going

442:45

to go ahead and click my changes of okay

442:48

and now whenever I go to the queries and

442:49

connections and actually hover over

442:51

something like data jobs Jan doesn't

442:53

just pop up on the screen and sort of

442:55

catch me off guard so I sort of like

442:57

that all right we now got some practice

443:00

problems for you to go through and get

443:01

more familiar with performing basic ETL

443:04

with power query and loading in some

443:07

different data sources with it with that

443:09

we'll see you in the next one we're

443:11

going to get into the power query editor

443:13

anyway nothing to be intimidated by as a

443:15

lot of the core principles we've learned

443:17

already in Excel are going to be applied

443:19

to this new window so you're going to

443:21

pick right up on it all right with

443:23

that I'll see you in the next

443:24

[Music]

443:28

one in this lesson we're going to be

443:30

continuing on with power query focusing

443:32

on specifically getting you introduced

443:35

to this power query editor and in order

443:38

to facilitate this we're going to be

443:40

going through or walking through

443:42

actually importing and cleaning up our

443:45

data science job posting data set that

443:47

has over 30,000 rows of data we're going

443:50

to be automating a lot of the steps

443:52

using power query that previously we had

443:55

to use functions and formulas for so

443:57

it's going to be saving us a lot of

443:59

time in order to actually automate this

444:01

data ingest

444:05

for this we're going to be

444:07

starting out with a blank workbook so I

444:09

know we do have this power query editor

444:11

but I don't want you actually editing

444:12

from that that's more for a reference

444:15

now if you do open this file in order to

444:17

reference it as we go along this

444:19

remember you're going to have issues or

444:21

an error saying hey data source isn't

444:23

there remember you need to go in and

444:25

actually select where this data set is

444:27

so under the data tab get data and then

444:30

under data source settings you're going

444:33

to need to update this link or this

444:36

address right here of where you're

444:38

actually accessing the data job salary

444:41

all Excel file this is my location not

444:44

yours got to update it anyway like I

444:46

said we're not going to be using this so

444:47

I'm going to open up a new workbook and

444:50

like before we're going to be importing

444:51

in that data set so we'll go to get data

444:54

from file from Excel workbook you'll

444:57

navigate to the course itself under

444:59

resources under data sets and then we're

445:02

going to be using this data jobs salary

445:04

all Microsoft Excel file go ahead and

445:07

import this in we're going to select

445:09

that sheet one and this time instead of

445:11

doing load or even the load to we're

445:14

actually going to go into transform data

445:17

and this is now going to pop open the

445:19

power query editor and this is where all

445:22

the magic happens behind the scenes in

445:24

order to get our data cleaned up so

445:27

let's go over a quick overview of the

445:29

window itself it's laid out very similar

445:32

to excel up at the top we have a

445:34

ribbon with four different tabs of Home

445:36

transform add column and view we'll be

445:38

walking through each one of these as we

445:40

go through this lesson underneath here

445:42

on the left-hand side we have which query

445:44

we have selected once we're building

445:46

multiple queries they'll start popping

445:48

up underneath each other we can close

445:50

this if we want to make more room

445:53

right here in the middle is what the

445:55

current step or what the current status

445:58

is of our data set now yours may look

446:01

a little bit different right now

446:03

specifically I have this column

446:04

distribution enabled underneath the view

446:06

tab which I'm going to go to more in a

446:08

second but anyway it basically outlines

446:10

all the different columns or where we're

446:11

at with the data set itself before we

446:14

finally load it in now right above this

446:16

area is a Formula bar just like similar

446:20

again to the Excel UI and this has all

446:24

the steps or all the code the M language

446:28

done in this current step if you will of

446:32

actually cleaning up this data set and

446:35

you're like step like what step well

446:37

over here on the right hand side we have

446:39

our query settings and in it we have the

446:42

name of our query and then we have the

446:44

applied steps this lists all the

446:48

different Transformations that we've

446:50

walked through so just a brief walkthrough the

446:52

first step is source and if I look at

446:55

the formula bar basically what it's

446:56

doing is it's connecting to that Excel

446:59

file with the file path that it has in

447:02

the next step of navigation it's

447:04

basically selecting hey out of that

447:05

Excel file actually select sheet one

447:08

from there to actually load in then from

447:10

there we can see that the headers are

447:13

actually in the first row and not up at

447:14

the top so the next or third step is to

447:18

promote the headers up to the top and

447:20

then finally the last step is change

447:23

type it actually goes through and

447:24

assigns for each of these what data type

447:27

it is so in this case job title short it

447:30

assigns to type text whereas something

447:32

like job posted date it assigns to type

447:35

number which needs to be a date which we're

447:36

going to fix in a little bit down

447:38

at the bottom there's a few statistics

447:39

on this specifically talks about 16

447:42

columns and over 999 rows and it tells

447:45

you when the last preview is downloaded

447:47

anyway if I just wanted to stop here

447:49

with this data transformation if you

447:50

will I would just come up into home go

447:52

into close and load we're just going to

447:54

do close and load to and in this case

447:58

like I'm just going to put in a pivot

447:59

table

448:00

specifically analyzing for job title

448:03

short specifically how many different

448:05

counts that we have of this we can

448:07

see totaling it all up we have around

448:10

32,000 anyway that's a quick overview
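
The applied steps described above (source, navigation, promoted headers, changed type) can be pictured as a small pipeline where each step takes the previous step's output. The Python sketch below is an analogy only, not the M code that power query generates. The sample rows are invented, and the date conversion assumes Excel's 1900 date system, where dates are stored as serial numbers.

```python
from datetime import date, timedelta


def promote_headers(rows):
    """Promoted Headers: use the first row as column names for the rest."""
    header, *body = rows
    return [dict(zip(header, values)) for values in body]


def excel_serial_to_date(serial):
    """Convert an Excel date serial (1900 date system, after Feb 1900) to a date."""
    return date(1899, 12, 30) + timedelta(days=int(serial))


def change_types(records, converters):
    """Changed Type: apply one converter per named column, leaving others as they are."""
    return [
        {name: converters.get(name, lambda v: v)(value) for name, value in record.items()}
        for record in records
    ]


# Source + Navigation stand-in: the raw grid of one sheet, headers still in row one.
sheet1 = [
    ["job_title_short", "job_posted_date"],
    ["Data Analyst", 44927],
    ["Data Engineer", 44958],
]

typed = change_types(
    promote_headers(sheet1),
    {"job_title_short": str, "job_posted_date": excel_serial_to_date},
)
print(typed[0])
# prints {'job_title_short': 'Data Analyst', 'job_posted_date': datetime.date(2023, 1, 1)}
```

Keeping each step separate is what makes it possible to click back through the steps and see the data at any point.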

448:12

let's actually get into exploring each

448:14

one of those tabs in the power query

448:16

editor so we're going to go back to data

448:18

get data and from there you can just

448:21

select this of launch power query editor

448:24

similarly you can also use a shortcut of

448:26

just alt F12 I'm on a Mac so I have to

448:29

press option

448:30

but actually launching this up boom it

448:32

has it with just a shortcut anytime you

448:35

launch it it may be grayed out here so

448:37

we need to make sure that we go in and

448:38

actually select a query that we want to

448:41

analyze and

448:45

transform for this overview we're going

448:47

to start with the view tab because

448:49

mainly I want to get into actually how

448:51

we can use the power query editor for

448:54

EDA and thus save us a lot of time of

448:57

actually having to analyze it in Excel

449:00

in the spreadsheets itself instead we

449:02

can do it right here so going through

449:04

this first thing is you can toggle on

449:06

and off the formula bar I always leave

449:08

the formula bar on so I don't know why that's

449:09

an option next is the data preview I can

449:12

change the font type I can also check the

449:16

column quality so this is telling us if

449:19

there would be a potential error in here

449:23

or if in this case of job location if

449:25

there's empty values you typically have

449:27

error values whenever the data type

449:29

isn't being understood correctly

449:31

so in this case job title short is text

449:33

everything in there is a text column if

449:35

I were to change this to number press

449:37

enter to run I'm going to get errors all

449:39

the way through here because well that

449:41

was text and can't convert text to

449:43

numbers also not sure why but it should

449:45

say 100% error but it's not anyway they

449:47

also have this green bar up at the top

449:50

and you can use this that's what I

449:52

actually prefer so I'm going to unclick

449:53

on the view tab the column

449:55

quality because you can actually look up

449:58

here and see and then also toggle it so in

450:00

this case for salary year average it looks

450:03

like there's 60% of them are valid and

450:06

40% are empty now remember this is only

450:10

a sample the data set is around 30,000 or

450:13

32,000 rows but it's only profiling so

450:16

down here on the bottom column profiling

450:18

based on the top 1,000 rows so that's

450:21

all we're seeing right here if I wanted

450:23

to see all of the data itself now

450:25

depending on how big it is we may not

450:27

want to do this I can select this at the

450:29

bottom and column profiling based on

450:31

entire data set and it's going to reload

450:34

back into here not sure how long it's

450:36

going to take now going over I can see

450:38

there's 22,000 data

450:41

points of the salary year where 10,000

450:44

are empty the other thing that you may
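
To make the point about profiling the top rows versus the entire data set concrete, here is a small Python sketch. The helper name `column_quality` and the sample values are assumptions made up for illustration; Power Query computes these figures for you.

```python
def column_quality(values, sample_size=None):
    """Count valid and empty cells, optionally looking only at the first rows."""
    sample = values if sample_size is None else values[:sample_size]
    empty = sum(1 for v in sample if v is None)
    return {"valid": len(sample) - empty, "empty": empty}


# Made-up salary column; None stands in for an empty cell.
salary_year_avg = [50000.0, None, 72000.0, None, None, 64000.0]

top_rows_only = column_quality(salary_year_avg, sample_size=3)
entire_column = column_quality(salary_year_avg)
print(top_rows_only, entire_column)
```

The two results differ, which is why the numbers in the editor can change once profiling is switched to the entire data set.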

450:46

have enabled by now is that column

450:47

distribution to be able to see what is

450:50

the breakdown of distinct

450:53

and also unique values investigating

450:56

what distinct and unique actually mean I

450:58

went back to the job title short looks

451:00

like now it's actually picking up on all

451:01

the different errors I'm going to

451:03

actually change this back we don't want

451:04

this to be number for job title short

451:06

we're going to change this back to text

451:08

and I'm also going to refresh the

451:10

preview by going to that Home tab

451:13

basically refreshing it to get it all

451:15

cleaned up anyway if we recall from our

451:17

previous analysis there's 10 different

451:20

job titles of say senior data scientist

451:23

data engineers and whatnot and so that

451:26

is the 10 distinct values they're

451:30

counted once each even though they have repeated

451:33

values in here like right in here in six

451:35

and 7 data engineer appears more than

451:38

once now if we go over to something like

451:40

job country they have 111 distinct so

451:43

meaning 111 different countries appear in

451:46

this column and only 12

451:48

countries that have one value for it or

451:51

one unique value all right the last
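
A minimal Python sketch of the distinct versus unique counts, assuming the usual meaning in the column distribution view: distinct counts every different value once, unique counts only the values that appear exactly once. The country list is made up.

```python
from collections import Counter

# Made-up sample of a job country column.
countries = ["United States", "United States", "India", "India", "Sudan", "France"]

counts = Counter(countries)
distinct = len(counts)  # every different value, counted once
unique = sum(1 for n in counts.values() if n == 1)  # values that appear only once
print(distinct, unique)  # 4 2
```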

451:53

thing in data preview is column profile

451:56

and this is pretty neat right now I'm

451:58

selected on the job title short column

452:00

it provides one on the left-hand side

452:02

key statistics about the column and then

452:05

two it actually shows a breakdown of the

452:08

value distribution of it so this is

452:10

really helpful in performing EDA if I

452:13

wanted to go through here and actually

452:14

see something so I can easily go in and

452:16

even see something like job country and

452:18

see how United States has the majority

452:20

of the values and then how the different

452:22

other countries Fall underneath that now

452:24

this takes up a lot of room and sort of

452:27

valuable real estate so I find myself

452:29

toggling this column profile on and

452:31

off all right last few sections in this

452:34

view tab go to column if you have a

452:36

large data set with a ton of columns you

452:38

can just come down here select the

452:39

column you want to go to and then it

452:41

will navigate you to it parameters this

452:44

is beyond the scope of this course we're

452:46

not going to be enabling parameters or

452:47

even using them so we'll call this N/A

452:50

next is the advanced editor which allows

452:52

us access to basically the behind the

452:55

scenes of our M language which

452:58

we're going to be breaking down further

452:59

in an upcoming lesson so we're going to

453:01

save that but you can also access that

453:04

from the home menu in advanced editor as

453:06

well lastly is query dependencies

453:09

whenever it gets into complicated ways

453:11

that you're actually building your

453:13

different queries and how they're

453:14

connected to each other this is going to

453:16

come in handy in this case we're

453:18

showing that hey we connected to that

453:20

Excel file on my MacBook and we loaded

453:24

it into a pivot

453:28

table all right next up is query

453:31

settings I'm actually going to go ahead

453:32

and close this out for queries over here

453:35

anyway with the query settings we can

453:37

actually change the name of the query if

453:40

we want to in this case it's named sheet

453:43

one I don't really like that I'm going

453:44

to name it something like data jobs and I

453:47

know it has salary data in it so I'm

453:49

going to have salary down here on the

453:51

applied steps like we mentioned this is

453:52

a step-by-step walkthrough of each of

453:55

the individual steps that power query

453:57

has taken to actually clean up our data

454:00

set now one thing I will call out in

454:02

this if I need to modify anything so in

454:05

this case if I wanted to modify the data

454:08

source here I could come inside of here

454:11

into the formula bar and edit it I would

454:14

encourage you if you're not familiar

454:15

with the formula bar with using

454:17

that or comfortable using it instead

454:19

click this settings icon over here

454:22

on the right hand side and then

454:24

typically a window will pop up and allow

454:26

you to edit it so I could technically

454:29

change the location of this or change

454:32

what type of file it is the same for

454:34

navigation as well I can basically pull

454:36

back up that navigation window that I

454:38

had before and change the sheet I wanted

454:40

to for the change type this doesn't

454:42

really have a gear icon next to it for

454:44

us to edit so we're about to go through

454:47

and actually change it but if we inspect

454:51

the job posted date we'll see that here

454:54

one it has it underneath the type number

454:56

but then actually looking at the column

454:58

itself it's a number value because

455:00

remember Excel stores dates as

455:03

number values behind the scenes well we

455:05

could convert this to a date by typing

455:07

in date here but you may not be

455:09

comfortable doing that just yet anyway

455:11

with that that's a great segue into the

455:13

Home tab into how actually we can change

455:16

something like a data

455:19

type with the home tab we've already

455:21

seen a lot of things already right we

455:23

saw the close and load to we also saw

455:26

that I can go through and actually

455:28

refresh my query to make sure that

455:30

it's fully loaded and up to date if I

455:33

have multiple queries I can not only do

455:35

this refresh preview I can go to this

455:37

refresh all and it does a refresh of all

455:39

queries we've already seen advanced

455:41

editor before properties just allows us

455:44

to actually go in and change the name of

455:46

this query if you want to and manage is

455:48

more advanced we'll be diving into that in a

455:50

little bit similar to Under The View tab

455:52

with goto column we also have this

455:54

option of choose columns and go to column

455:58

we can also just actually select a

456:00

column if you will so if I wanted to

456:02

actually select job posted date or even

456:05

more than that I can just do that and

456:07

it's going to select it and it's going

456:10

to actually remove all the other columns

456:12

so which is not what we want to do which

456:14

brings us to a good point if we want to get

456:16

rid of a step all we have to do is

456:18

come over to the applied steps and

456:20

there's a red x mark that will appear

456:23

over any step that you do so I'm just

456:25

going to go ahead and click X here and

456:27

it's going to remove anything that I've

456:29

done moving on to remove columns which I

456:31

think is pretty self-explanatory if you

456:32

want to remove a column you just select

456:34

it and you select remove column

456:36

additionally if I want to remove all

456:38

other columns so in this case job title

456:40

let's say I want to keep that I could

456:41

select remove all other columns and it

456:43

would do that I want to cancel this step

456:45

so I'll click X similarly to remove

456:47

columns we have well keep rows and also

456:50

remove rows and then we have options for

456:52

also sorting our values if we want to

456:55

sort them from a to z or Z to A

456:57

depending on a column so back to job

457:00

posted date maybe I wanted them in

457:02

numerical order I could just click A to

457:05

Z and it would go through and actually

457:06

sort it anyway I don't really want to do

457:08

this I'm going to clear this step as

457:10

well this brings us actually into what

457:12

we want to do of we want to change this

457:15

job posted date to a date time and

457:18

that's what we're going to use underneath

457:20

this transform section in the Home tab

457:22

right now this data type as I'm

457:25

selecting this job posted date it notices

457:27

that it's a decimal number I go to

457:29

something like search location it

457:31

changes to text so what I want to do is

457:34

change this data type of decimal number

457:37

to specifically a date time because

457:40

that's what we have in here we have date

457:41

and time now this popup is going to come

457:44

up if you're doing this underneath the

457:46

step that has changed type already what

457:48

it's noticing is that the selected

457:51

column has an existing type conversion

457:53

would you like to replace the existing

457:56

conversion or basically preserve that as

457:58

a number and add a separate step I'm

458:01

just going to go ahead we're going to do

458:02

replace current but I just want to show

458:03

what it looks like of adding another

458:05

step in this case I converted it in this

458:07

step to a number and then the next step

458:11

I converted it to a date time I don't

458:13

like having a bunch of steps I want to

458:15

make this as concise as possible so I'm

458:17

going to clear that step instead and

458:19

instead this time whenever we go through

458:21

it and select date time I'm going to say

458:24

hey replace current now underneath here

458:27

it updated that job posted date type to

458:30

date time and it's all within one step
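
For intuition on why a decimal number can become a date time, here is a small Python sketch of the conversion. It assumes Excel's default 1900 date system, where the whole part of the serial number is days and the fraction is the time of day; the function name is made up for illustration.

```python
from datetime import datetime, timedelta

# Excel's 1900 date system: for modern dates, day 0 works out to 1899-12-30.
EXCEL_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel serial number (days, with the time as a fraction) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


print(serial_to_datetime(44927.5))  # 2023-01-01 12:00:00
```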

458:32

love this similarly to that date time I

458:35

also want to convert the salary year

458:37

average and the salary hour average

458:40

columns right now they're decimal

458:42

numbers which is nothing wrong with that

458:45

but I actually have the option to change

458:47

it to something like a currency in this

458:50

case once again I want to replace the

458:52

current step for that I'm going to do

458:54

the same for salary hour average and

458:57

change that to a currency as well for

459:00

replace current covering briefly these

459:03

other sections in the Home tab first up

459:05

is merge and append we're going to be

459:08

covering an entire lesson on this and

459:10

how we can actually take different Excel

459:13

files and different queries and combine

459:14

them together with this manage

459:17

parameters is outside the scope of this

459:19

course I don't find myself ever really

459:21

doing this so not something we need to

459:22

worry about data source settings similar

459:25

to what we saw outside of the power query editor

459:28

in Excel basically the same popup is

459:30

going to come here to allow you to

459:32

change where your data source is and

459:34

then down here at the very end if we

459:36

wanted to put in a new query I

459:38

wouldn't necessarily have to back out of

459:40

the power query editor I could just come

459:41

in here and select a new source a file

459:44

or database or other source and then

459:46

work through actually importing it in as

459:48

a query sometimes I find myself also

459:50

using this one of enter data say I had a

459:53

simple table that I wanted to input into

459:57

Power query to have I could go through

459:59

and just create that

460:03

table all right next up is the transform

460:06

Tab and this one I feel is maybe

460:08

actually although it looks like a lot of

460:09

options it's probably one of the

460:11

simplest as you can see we have things

460:12

like text column number column date and

460:15

time columns structured columns

460:16

basically if we have a data type of this

460:19

we can go to it so if I

460:21

have a number column I want to go to

460:23

this and see what things I could do to

460:24

it such as I could do statistics on it I

460:27

could do rounding to it or I could even

460:30

get information out of it if it's even

460:31

or odd I also have this section on any

460:34

column that basically applies to any

460:36

column this allows us to one like we

460:39

saw in the Home tab actually convert the

460:41

data type of something but also even

460:43

more advanced Transformations such as

460:46

pivoting and unpivoting columns which

460:49

we're going to be diving deeper into in

460:50

the next lesson on Advanced

460:52

Transformations and finally we have this

460:54

section on tables which just does more

460:56

of generic things to this data set such

460:59

as if I wanted to actually go through

461:01

and count the rows on this I could and I

461:03

find out I have 32,000 different rows on

461:06

this anyway I actually want to transform

461:08

a column of this specifically this job

461:11

via column as you notice from here that

461:15

all these different job platforms have

461:17

via and then a space right at the

461:20

beginning of it I want to actually

461:22

remove that so in order to do this I

461:24

make sure that one job via column is

461:26

selected I notice up here in the any

461:29

columns it has the data type of text now

461:32

there are a few options underneath

461:34

the text column section for like

461:36

splitting columns I could split it by

461:39

this half and then delete that via but I

461:41

find actually the easiest way to do this

461:43

is just go through this replace values

461:47

and we're not going to do replace errors

461:48

we're going to just do replace values

461:49

itself and we find a value in here in

461:53

this case we want to find VIA with a

461:56

space and we want to replace it with

461:58

well nothing if I wanted to go into

462:00

advanced options I have a few

462:02

different selections available but

462:03

neither of these is applicable so

462:05

we're going to just go ahead and click

462:06

okay and Bam now we have these job

462:09

platforms cleared up now we've been
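
The replace values step just shown behaves much like a plain substring replace. A minimal Python sketch, with made-up platform names standing in for the job via column:

```python
# Made-up sample values for the job via column.
job_via = ["via LinkedIn", "via Indeed", "via Ai-Jobs.net"]

# Find "via " (with the trailing space) and replace it with nothing.
cleaned = [value.replace("via ", "") for value in job_via]
print(cleaned)  # ['LinkedIn', 'Indeed', 'Ai-Jobs.net']
```

Note that a substring replace touches every occurrence of "via " in a cell, not only the one at the start, so it is worth scanning the column afterwards.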

462:12

going through this and keeping the names

462:14

of these steps the same but sometimes I

462:16

like to be more descriptive when it's

462:19

not a general step now it named this new

462:22

Step replaced values I may actually do

462:25

that a few times and I want to be able

462:27

to whenever I go back to this actually

462:28

be able to identify what steps did what

462:31

in this case change type promoted

462:32

headers navigation Source those are all

462:35

usually only done once so I

462:37

know what that means however for

462:38

this one I don't know so I'm going to

462:40

right click it and go to rename and I'll

462:42

say this is replaced via in job via

462:46

which is much more descriptive in my

462:50

opinion all right only one more tab to

462:53

cover and that is the add column with

462:56

transform we transformed a current

462:58

column with add column we're adding an

463:00

additional column to this similar to

463:02

transform it has these options for text

463:05

number and also date and time so very

463:08

familiar features with this so let's say

463:10

I wanted to extract the month and the

463:13

year out of the job posted date column

463:15

basically I want a column for month and

463:17

I want a column for year anyway

463:19

previously we learned with that

463:20

transform tab if I were to come into

463:21

here under date time and then select

463:24

something like month it's going to

463:27

transform this column so it's going to get

463:30

rid of the contents of the job posted

463:32

date which is not necessarily what I want I

463:35

want a new column so I'm going to

463:36

actually get rid of this step so with add

463:39

column what this does is with that job

463:41

posted date column selected I select

463:44

date in this case I want month I could

463:46

do start of month end of month day of

463:48

month whatever I just want the month

463:49

itself and then inserted month is pretty

463:52

descriptive I however don't like the

463:55

name of this so I could come in here

463:57

this is an option and change I double

463:59

clicked on this and name this job posted

464:03

month and then press enter now with this

464:07

I'm going to get a renamed columns step here

464:11

so now I have two steps of this month

464:14

was inserted into this and then we

464:16

rename the column I would encourage you

464:19

to minimize the amount of steps you have

464:21

because these queries can get quite long

464:23

in this case I'm going to delete this

464:25

rename column go back to this inserted

464:27

month if we actually read this you

464:30

don't actually need to understand what's

464:32

going on much in here but I can see

464:35

basically that we have this month in

464:37

quotation marks and this is named month

464:41

so I basically can reason that this is

464:44

probably the new column title of this so

464:48

instead of using month I'm just going to

464:50

edit this in the formula bar to job

464:52

posted month then I'm going to click at

464:55

the end and press enter and now all

464:58

within one step I inserted that month

465:01

and renamed it as well if you're not

465:04

comfortable doing that feel free to go

465:06

through that next step of actually

465:07

double clicking this and actually

465:08

changing it but I would encourage you if

465:10

you can actually try to mess around with

465:11

the formula if you make a mistake it's

465:14

pretty simple to just X out of that step

465:17

and then redo it again so there's no

465:19

harm to your actual data set now

465:21

similarly if I wanted to create that job

465:23

posted year column I could just go

465:25

through here select year whether I want

465:28

start of year end of year or year itself once

465:30

again it inserts year and then I would

465:32

want to change the name of this and

465:33

change this to job posted year and then

465:36

click enter and Bam now we have it I

465:39

don't actually need this all these are

465:41

from 2023 this is not

465:44

going to provide any useful data for me

465:45

so I'm actually going to delete this

465:49

step all right I want to do one last

465:51

transformation before we actually load

465:53

this and going to actually visualize

465:55

this so we have our salary year average

465:58

column and then also want to compare

466:00

this to the salary hour average column

466:05

but right this is on a yearly basis this

466:07

is on an hourly basis what we could do

466:10

is do a conversion to our salary hour

466:14

average column to get it to an equal

466:16

value or comparable value to our yearly

466:18

value meaning we could put the number of

466:20

hours in a year multiply it times this

466:23

value and from there get what would be

466:26

the expected yearly salary for this hour

466:30

data so I could do this via the

466:32

transform tab right going into that

466:34

number column under standard we want to

466:37

actually multiply and then there's 2080

466:41

hours in a year working hours for a 40

466:44

hour work week I could go through

466:47

and actually do that and that's going to

466:49

update this column itself but remember

466:52

we probably want its own column so I'm

466:55

not going to use that instead we'll go

466:57

to add column with this

466:59

salary hour average column selected select

467:02

standard multiply put in those hours of

467:05

2080 and then click okay once again I'm

467:08

going to rename this I can see that this

467:10

multiplication column is titled this via

467:14

in this step right here so I'm going to

467:15

rename it to salary hour adjusted and in

467:20

this case I'm going to also rename this

467:21

step to adjusted hourly salary to yearly
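
The arithmetic behind the adjusted column, sketched in Python. The 2080 figure comes from 40 hours a week times 52 weeks; the underscored column names and the sample rows are assumptions for illustration.

```python
HOURS_PER_WEEK = 40
WEEKS_PER_YEAR = 52
WORK_HOURS_PER_YEAR = HOURS_PER_WEEK * WEEKS_PER_YEAR  # 2080


def add_salary_hour_adjusted(rows):
    """Add a new column instead of overwriting the hourly one, like the add column tab."""
    out = []
    for row in rows:
        hourly = row["salary_hour_avg"]
        adjusted = hourly * WORK_HOURS_PER_YEAR if hourly is not None else None
        out.append({**row, "salary_hour_adjusted": adjusted})
    return out


# Made-up rows; None stands in for an empty hourly salary.
jobs = [{"salary_hour_avg": 25.0}, {"salary_hour_avg": None}]
adjusted_jobs = add_salary_hour_adjusted(jobs)
print(adjusted_jobs)
```

Keeping the original hourly column and writing the result to a new one mirrors the choice made in the lesson.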

467:26

now I'm sort of a stickler for keeping

467:27

my data set in order right now I have

467:29

this job posted month and it's

467:32

pretty far away

467:34

from my job posted date I would actually

467:36

want to move it right next to it so

467:38

there's a couple options I can do to

467:40

move it I can select the column and then

467:43

come up here to the transform Tab and

467:45

move it left right to beginning or to end

467:49

or I can actually just take it and then

467:52

drag it and this is taking forever it's

467:54

like watching paint dry but find where I want it

467:57

boom plant it in and then it inserted the

467:59

step of reordered columns I'm going to

468:01

do the same thing with salary hour

468:05

adjusted and put it right next to salary

468:07

hour average and both of these are done with

468:10

one step of reordered columns so I'm

468:13

fine with

468:16

that so now let's actually get into

468:19

analyzing this specifically I want to be

468:21

able to analyze and compare this salary

468:23

hour adjusted column that we just

468:25

created compared to the salary year

468:28

average so going back to home I'm going

468:30

to close and load this in we have this

468:33

previous analysis that we did before

468:35

doing EDA on the jobs I actually want to

468:38

create my own from scratch all right so

468:40

back on sheet one we can see our queries

468:41

and connections specifically that data job

468:43

salary remember the data tab you can go

468:46

into that and you can toggle on that

468:48

queries and connections anyway we want

468:50

to insert I want to analyze that hourly

468:53

adjusted salary so I'm going to come in

468:55

to create a pivot chart we can also do pivot

468:59

chart and pivot table at the same time

469:00

anyway when this pops up for pivot table

469:02

or pivot charts we want to we're not

469:04

going to select a table or range because

469:08

this is a power query connection if you

469:10

will we're going to use this external

469:13

data source and we're going to say

469:15

choose connection what connection do we

469:17

want to use for this specifically I want

469:19

to use that data job salary so go ahead

469:22

and click that and open and we're going

469:24

to insert it into the existing worksheet

469:27

so now the pivot table is set up for us going

469:28

forward one quick note you may be

469:31

tempted say if we went back to jobs Eda

469:34

to right-click this and then go load to

469:37

and let's say hey I wanted to create a

469:39

new pivot chart well the problem is it's

469:42

going to then get rid of this pivot

469:45

table that we previously created so you

469:48

don't want to necessarily if you want to

469:50

keep this you don't want to actually do

469:52

that back to the pivot table itself

469:54

you'll notice now because we have these

469:55

queries and connections you can

469:57

toggle between the two over here on the

469:59

right hand side anyway what I want to

470:01

compare is that salary hour adjusted to

470:06

that salary year average right now it's

470:09

doing sums we don't want that so

470:12

we're going to go to value field

470:13

settings we're going to do average here

470:15

we're eventually going to do median I

470:17

promise you but we're going to stick with

470:19

average for the time being I'll adjust

470:21

both of these to be of average then I'm

470:24

not really liking the formatting here I

470:26

know we adjusted it as currency back in

470:28

the power query but this is the one

470:30

data type that I find doesn't actually

470:34

follow through in actually making it into

470:36

the correct data type when you import it

470:38

into Excel so you do need to go back

470:40

still and actually convert it into the

470:42

correct thing anyway we're seeing that

470:43

the hourly salary is much less than the

470:48

yearly salary and moving this over we

470:50

can also see this via visualization this

470:53

doesn't really show as much I would

470:55

rather look at this when compared to job

470:58

type

470:59

so I'm going to go ahead and grab job

471:00

title short and throw it into the axis

471:04

now closing out of this and then closing

471:07

out of this on the side we can now get a

471:10

better view of this I'm not liking the

471:13

format of this pivot chart specifically

471:15

I'm going to go in here design under

471:16

change chart type and change this to a

471:19

bar chart I feel like it's going to be

471:21

easier to read yeah it's a lot easier to

471:23

read also for these visualizations I'm

471:25

going to rightclick this and I'm going

471:27

to say hide all field buttons so that'll

471:29

make this easier to view and I'm going

471:31

to go ahead and stick The Legend at the

471:34

bottom okay we're off to a good start

471:37

other things I want to do to clean this

471:39

up is oh my goodness this is so long I'm

471:41

going to change these column titles to

471:44

hourly adjusted salary and then yearly

471:47

salary additionally I want to sort this

471:49

a little bit better specifically from

471:51

high to low so under sort options more

471:54

sort options I'm going to go into

471:56

sorting this ascending based on the

471:58

yearly salary from high to low sorry

472:01

that's actually descending selecting

472:03

yearly salary clicking okay no it was

472:06

right the first time it's ascending okay

472:08

this is looking good you know also I

472:09

don't like having different colors I

472:11

like actually going with a consistent

472:14

theme so going into design change colors

472:18

I'll change this to this monochromatic

472:20

palette 8 and Bam we now have our final

472:24

visualization where we used power query to

472:26

basically ingest all our data in clean

472:29

it up create this new column of hourly

472:33

adjusted salary perform an analysis in

472:35

Excel to average it and we can see that

472:39

consistently the hourly salary is well

472:43

below that of the yearly salary so I

472:46

guess it pays to have a salary job all

472:49

right we have some practice problems for

472:50

you to now go through and test out all

472:54

these different features and get more

472:55

familiar with the power query editor in

472:59

the next lesson we're going to be going

473:00

into advanced Transformations and Diving

473:03

deeper specifically in analyzing skills

473:06

and using power query to actually clean

473:08

it up so we can actually analyze

473:10

skills with that see you in that

473:15

one all right welcome to this lesson

473:17

we're going to continue on with power

473:19

query specifically focusing on using

473:22

more advanced Transformations and for

473:25

this we're actually going to get into

473:27

analyzing those skills and being able to

473:30

put them on a graph and actually

473:31

visualize what are the top skills of

473:34

data nerds now if you recall way back in

473:37

the functions and formulas chapter when

473:40

we went over text functions we did a

473:42

little bit of text cleanup to clean up

473:44

this column and then plot it but we were

473:46

only able to do that with around 20 rows

473:49

now with the power of power query we're

473:52

actually going to be able to clean up

473:53

all these values and be able to

473:56

visualize it for all 30,000 job postings

473:59

so let's jump in if you want to you can

474:01

continue on from that worksheet that we

474:03

used in the previous lesson and just

474:07

make sure that you do go through and

474:09

actually save it before you continue on

474:12

however if you got lost in the way or

474:13

you just don't have that file anymore

474:14

feel free to use the lesson or the file

474:17

from the last lesson of power query editor

474:20

once again you don't want to be using

474:21

the actual one working cuz that has the

474:23

final results we're going to want to

474:24

work with that one and this has all the

474:26

different work that we did it also has

474:28

some additional analysis whenever I

474:30

looked at plotting it over time to see

474:33

how the salary of yearly versus

474:35

hourly

474:38

compared anyway let's get into editing

474:41

this and we can get to the power query

474:42

editor by going up to get data launch

474:45

power query or pressing alt F12 once it

474:49

loads I need to click on the query

474:50

that I actually want to look at and I'm

474:52

going to close this or minimize this the

474:54

first thing that I want to do is start

474:56

an index column on this data set because

475:00

in general whenever you have a source

475:02

data set or a fact table like this

475:05

you want to have an index associated

475:08

with it yeah these row numbers are good

475:09

but that's not good enough and we'll be

475:11

using it more in the power pivot chapter

475:13

but it's good practice to start it now

475:15

so moving over to the add column tab I'm

475:17

going to go to index column it allows us

475:20

to start from either zero or one I'm a

475:22

coder so I like from zero now Pro tip I

475:25

want this index at the front now I could

475:27

go to transform and then move and

475:30

then move this to the beginning but

475:32

remember we did this reordered columns

475:34

right here so what I'm actually going to

475:35

do is take this added index put it

475:38

before reordered columns now that the

475:41

reordered columns is right there

475:43

whenever I select this index and move

475:46

this over to beginning it's going to be

475:48

included in part of this step of all of

475:51

our column reordering so I don't have once

475:53

again multiple different reordered

475:54

columns
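
To make the index step concrete, here is a minimal Python sketch of the same idea: number every row starting from zero and keep that number as the first column. The sample rows and the names `rows`, `add_index_column` and `Index` are made up for illustration and are not taken from the course files.

```python
# Hypothetical sample rows standing in for the job postings table.
rows = [
    {"job_title_short": "Data Analyst", "salary_rate": "year"},
    {"job_title_short": "Data Engineer", "salary_rate": "hour"},
]


def add_index_column(rows, name="Index", start=0):
    """Return new rows with an index column placed first."""
    indexed = []
    for i, row in enumerate(rows, start=start):
        # Putting the index key first in the new dict keeps it as the
        # leading column, similar to moving it to the beginning in the editor.
        indexed.append({name: i, **row})
    return indexed


indexed_rows = add_index_column(rows)
print(indexed_rows[0])
```

Starting from zero or one is only a convention. What matters is that each row of the fact table gets a stable, unique number that other queries can point back to.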

475:58

all right in order to clean up this job

476:00

skills column we're going to end up

476:02

putting this uh these skills right

476:06

now they're separated by comma inside

476:07

of this list we're going to be breaking

476:09

them up into their own individual rows

476:12

and because we're breaking this up into

476:13

different rows this now is going to put

476:16

for this Row one value here this is

476:18

going to make 1 2 3 4 5 6 7 this is

476:21

going to make seven different rows of

476:23

data this is going to mess up anytime we

476:25

want to analyze anything because imagine

476:27

if you have like a salary data it's then

476:29

going to appear seven times so the main

476:31

point of explaining that is we want a

476:33

new query to actually populate and

476:37

actually break these skills out into

476:39

their own separate rows so in order to

476:41

create a query or another query right

476:43

now we have one query to create

476:46

another query from this we have two

476:48

options and that's underneath Home tab

476:50

they have manage and we can either

476:54

delete a query which we're not going to

476:55

do we can either duplicate it or

476:57

reference it I can also get to this by

476:59

just right-clicking the query and it

477:01

also has these options of duplicate and

477:03

reference let's actually look at both of

477:05

those starting with duplicate first so

477:08

I've created my duplicate query and as

477:10

you can see it basically has a duplicate

477:13

of the original query nothing really has

477:17

changed from it now this is cool if I

477:19

want to walk through all the different

477:21

steps again and I wanted to have it in

477:23

this new query but I actually like this

477:27

other option so I'm going to go to data

477:28

job salary right click I'm going to go down

477:31

select reference okay this query this

477:34

one named three is referencing data job

477:38

salary and it only has one applied step

477:41

if we look at the applied step all it is

477:43

doing is referencing the data jobs

477:46

salary so this first query right now and

477:48

populating it for us and this is really

477:50

good because say now I make changes to

477:53

the original query such as say I want to

477:56

go through and I don't want any more

477:58

of the hourly data in here I only want

478:00

the yearly data so I filter down to only

478:02

have the yearly data so now it's

478:05

filtered these rows for the yearly data

478:07

don't worry we're actually not going to

478:08

do this I'm going to delete this step but

478:09

anyway if I go to that referenced query

478:12

the one with the three at the end this

478:15

one only has yearly values in it this I

478:19

can verify is 100% yearly by looking

478:22

either the column distribution or the

478:24

column profile everything is yearly anyway

478:27

we don't actually want to do that step

478:29

so I'm going to go back to this original

478:31

query clear the filtered rows and once

478:34

again it's going to just clean this back

478:36

up to have two distinct values so

478:38

checking the salary rate yearly and

478:39

also hourly okay so we like the

478:42

reference for our case cuz we may

478:45

make changes to the original one so I'm

478:47

going to delete this number two because

478:49

remember that was the duplicate and

478:50

we're going to keep the number three one

478:52

which was the reference we're also going

478:54

to be doing all our alterations on the

478:56

skills on this one so I'm going to

478:58

rename this one data jobs

479:03

skills so with this new query data jobs

479:06

skills let's actually get into cleaning

479:08

up this column of data of job skills

479:11

specifically we're going to be

479:12

separating this into each of these

479:15

skills into the new rows by this comma

479:18

delimiter but we need to remove a few

479:21

things from this specifically this has

479:23

brackets around it and it also has

479:24

single quotes we don't need any of that

479:26

we need to remove it so going to that

479:28

transform tab we're going to go into

479:30

replace values and we've done this

479:32

before so for the value to find I'm going

479:34

to just start with the first square

479:36

bracket we want to replace with nothing

479:38

I'm going to click okay additionally we

479:40

want to replace the other bracket as

479:43

well replace it with a blank and then

479:45

finally we want to replace that single

479:48

quote as well also I'm going to just

479:51

rename these all next thing we're going to

479:53

do is actually split these columns on

479:55

this delimiter of a comma so under

479:59

transform we can go here to split column

480:02

it has a few different options by

480:03

delimiter number of characters by

480:05

positions we can go to by delimiter I'm

480:09

going to select that for this we're

480:11

going to use a comma delimiter because

480:13

there's multiple different options you

480:14

could potentially use for this we want

480:16

to split at not just the leftmost but we

480:18

want to split at each occurrence there's

480:21

no quote characters in here we removed

480:23

all the quote characters so I'm going to

480:24

click none and then click okay so now we

480:28

just split these skills into let's see

480:31

how many different columns we have here

480:33

looks like we have up to 24 skills for

480:37

all these different skills that we have

480:39

so now what we need to do to get all of

480:41

these if you will skills within a single

480:45

column we need to unpivot them but the

480:49

one issue right now so I have all these

480:50

skills right here but we also have all

480:52

these other columns right here I don't

480:55

really care about all of them I just don't

480:57

really care about them too much I want

480:59

to mainly just analyze job title short

481:01

and index so what I'm going to do to

481:03

make this easier because I need to

481:05

basically select which columns I want to

481:07

remove or which ones I don't want to

481:09

remove in this case so what I'm going to

481:11

do is go back to source and this one has

481:15

before we've actually broken up the job

481:17

skills so I'm going to select job skills

481:20

hold down control and then from there

481:22

select job title short and also index

481:25

and then underneath the Home tab we're

481:26

going to go to remove columns what we're

481:28

going to do remove other columns

481:30

basically going to keep those three

481:32

columns that we have now we are doing

481:34

this in the applied steps after that

481:35

first step of source so it's asking hey

481:37

do we want to insert this step yes we do

481:40

and so now we've limited it down to

481:42

those three columns and Bam now whenever

481:45

we go down here down to that last step

481:48

of change type we can see that we have

481:50

all our different job skills and then

481:53

over on the right hand side we have our

481:55

index and our job title short which I

481:58

don't really like the order of this I'm

481:59

actually going to go back to reorder

482:01

this over here I'm going to just take

482:04

these column values and then put them in

482:06

this order of index job title short and

482:09

job skills so now we actually get into

482:12

unpivoting these job skills columns

482:15

basically making all these job skills

482:17

into one column so I'm going to select

482:19

instead of selecting all the job skills

482:21

column I'm actually going to select the

482:22

opposite holding control select the

482:24

index and job title short and I'm going

482:27

to go into the transform tab into unpivot

482:30

columns and for this one once again

482:33

we're going to use the other we want to

482:34

unpivot other columns and go ahead and

482:37

do this all right so what we do here we

482:40

now have this new column of attribute

482:42

and value attribute if we go back that

482:46

is just the name of the column that was

482:48

created previously and then the value is

482:51

what was in the cell itself and that's

482:53

filled with all the skills so personally

482:56

I don't really care for use of this

482:58

attribute so I'm going to go ahead and

483:01

just remove this column by right

483:02

clicking and selecting it additionally

483:04

I'm going to go back up here and I don't

483:05

want this to be named value so I can go

483:09

in and inspect this under unpivot other

483:11

columns I can see in here that it

483:14

renames these columns attribute and

483:16

value in this case I don't want it to be

483:18

value like I said I want it to be job

483:21

skills clicking enter boom renamed it to

483:25

job skills and then in here it is job

483:27

skills
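
The cleanup, split and unpivot steps up to this point can be sketched in plain Python. This is only an illustration of the logic, not what Power Query runs internally, and the sample rows, the column names and the helper `unpivot_skills` are all assumptions.

```python
# Hypothetical rows: each posting stores its skills as one bracketed string.
rows = [
    {"index": 0, "job_title_short": "Data Analyst",
     "job_skills": "['sql', 'excel', 'python']"},
    {"index": 1, "job_title_short": "Data Engineer",
     "job_skills": "['python', 'aws']"},
]


def unpivot_skills(rows):
    """Return one row per (posting, skill) pair."""
    long_rows = []
    for row in rows:
        text = row["job_skills"]
        # Same three replace-values steps: drop [, ] and single quotes.
        for junk in ("[", "]", "'"):
            text = text.replace(junk, "")
        # Split at every comma, then emit one output row per skill.
        for skill in text.split(","):
            if not skill.strip():
                continue  # skip empty cells, as an unpivot drops blanks
            long_rows.append({
                "index": row["index"],
                "job_title_short": row["job_title_short"],
                "job_skills": skill,
            })
    return long_rows


skill_rows = unpivot_skills(rows)
for r in skill_rows:
    print(r)
```

Note that splitting on a bare comma leaves a leading space on every skill after the first one in each posting, so a trim is still needed before counting.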

483:28

now one thing that's bothering me real

483:29

quick before we continue on to actually

483:31

visualizing this data is this column

483:34

here typically I like to name things

483:36

something like job underscore whatever it is

483:38

in this case index I want to name job_id

483:41

but if you recall back we created

483:43

this back in this data jobs salary

483:46

portion especially here under the step

483:48

of added index I want to change this

483:51

from index as we've done before going in

483:53

and renaming it to job ID however

483:57

whenever I do this press enter this is

483:59

going to break my queries and this is

484:02

going to happen to you anytime you're

484:03

manipulating it so I think we need to

484:04

get familiar with it so if I go to the

484:06

next step of reorder columns we're going

484:08

to have this expression error the column

484:11

index of the table wasn't found duh

484:14

because we named it job ID in the

484:17

previous step instead of index but this

484:19

step is still the same so what I can do

484:21

is come in here change index to job ID

484:25

press enter and Bam that updates but

484:28

then now going to data job skills we're

484:31

going to have the same thing you're

484:33

going to notice with this one right the

484:35

column index of the table wasn't found

484:37

so same error message what we want to do

484:39

is go to error it's going to

484:41

go to the first occurrence of that error

484:43

and this is trying to reference index

484:46

if you recall back if we go to the

484:47

first step of source we expect it to be

484:50

called job ID now because we renamed it

484:52

right so I'm going to change this to job

484:55

ID and then scrolling through the

484:58

applied steps to see whenever we get to

485:00

our next error if there is an error and

485:03

that's unpivot other columns

485:05

specifically they have job title short

485:07

and index I don't want index here I want

485:09

job ID and now bam now we have it

485:12

cleaned so I should have done that job

485:14

ID but that was actually good

485:15

troubleshooting to walk through that you

485:17

may

485:20

encounter so let's actually get into

485:23

visualizing this so we're going to go to

485:24

home and we're going to close and we're

485:26

going to close and load

485:28

now it's popping up as a table but we

485:30

actually want to analyze this I don't

485:32

really care to have it as a table so I'm

485:34

going to right click it and I'm going

485:35

to click load to specifically we're going

485:37

to go to a pivot chart and we'll insert

485:41

in the existing worksheet because we're

485:43

going to get rid of that data yes

485:44

there's going to be possible data loss

485:45

we understand that so I'm going to move

485:47

this chart off to the side select inside

485:49

the pivot table and we want to analyze

485:51

the job skills so I'm going to take the

485:53

job skills put them in rows and then the

485:55

job skills also in the values to

485:57

count up the values then also I'm going

485:59

to sort them I want to sort them from

486:02

high to low so I went to more sort

486:04

options um we're doing a descending

486:06

order count of job skills so now there's

486:10

a ton of different skills in here but

486:12

I want you to inspect this if you notice

486:15

one these skills sometimes have

486:18

spaces in the front of them basically we

486:21

didn't do a full cleanup of this so

486:24

that's why we have python twice in here

486:26

is cuz this one has a space in front of it so

486:28

opening up the power query editor by

486:30

pressing alt F12 so

486:33

underneath the data job skills query I'm

486:36

going to go ahead and we want to do a

486:38

text

486:39

transformation specifically if we look

486:41

underneath this underneath format we

486:42

can change this to lower case upper

486:44

case capitalize each word we're going to

486:46

do trim which removes leading and

486:49

trailing white space from each of the

486:51

cells in the selected column from there

486:53

we'll go back to home close and load

486:55

this and now it's going to be reloading

486:58

the data and those duplicate values are

487:01

now going to be removed now there's a

487:03

lot of skills here so I really only want

487:05

to see the top 10 so I'm going to put a

487:08

filter on here go into value filters and

487:12

top 10 specifically I want to see the top

487:14

10 items by count of job skills also I'm

487:18

going to rename this to skill count and

487:21

because these are text values down here

487:23

I'm actually going to change this from a

487:25

column chart going to change chart type

487:28

into a bar chart instead clicking okay

487:32

boom and then with this obviously it's

487:34

not sorted from high to low that's how I

487:36

want actually to sort it so I'm going to

487:38

go in here back underneath our more sort

487:41

options changing this from descending to

487:43

ascending and the good thing about this

487:46

is we still have that top 10 filter on

487:48

it so it's still going to apply this and

487:49

have the top 10 values on there first

487:52

last little clean up I'm going to hide

487:53

all field buttons I'm going to get rid

487:55

of this Legend right here and then

487:57

I'm going to rename this to what are the

488:00

top skills of data

488:04

nerds now let's say that I'm frequently

488:08

referencing the top 10 skills as we have

488:12

right here and instead of having to

488:14

populate this every single time I want

488:17

to actually create its own or create a

488:19

query for this so opening power query

488:22

going to alt F12 I could do the same

488:25

analysis inside of power query and

488:27

get this into its own table to be reused

488:30

but for this I don't want to use this

488:31

data job skills query instead like we

488:34

did before I'm going to create a new

488:35

query we're not going to duplicate this

488:37

instead we're going to reference it so

488:39

now it's Unique and distinct and I'll

488:41

rename this data jobs skill count

488:45

because we'll get the top 10 and their

488:47

Associated count so in order to do this

488:49

analysis to find what is the count of

488:51

all these different skills we want to do

488:54

a group by and it's right here under

488:56

transform or under that Home tab and I

488:58

can do group by which groups rows in the

489:00

table based on the values in the

489:02

currently selected column we're going to

489:04

be performing a basic Group by we're using

489:06

that job skills column I could change it

489:08

to another column if I wanted to and

489:09

that new column name is going to be

489:10

skill count operation we're going to be

489:13

counting the rows we could do any other

489:14

type of aggregation as well if we had

489:16

numerical data we could do average

489:18

median min max whatnot go ahead and

489:20

click okay so we've done this

489:22

aggregation now the next thing is I just

489:25

want to get the top 10 values but before

489:27

to do that I need to actually sort this

489:30

in descending order right now I can tell

489:32

looking into the numbers this isn't

489:34

necessarily sorted although it looks like it is

489:35

right so clicking the arrow up at the

489:38

top I'm just going to say hey sort

489:39

descending and then we want the top 10

489:42

values so underneath the Home tab under

489:43

keep rows I'm going to have keep top

489:47

rows and it's going to prompt me how many

489:49

number of rows do I want to keep 10 in

489:51

this case I want the 10 values and now

489:53

from here all I got to do is close and

489:55

load this into its own separate query
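
As a rough sketch of what the group by, sort descending and keep top rows steps compute, the same result can be produced with Python's standard `collections.Counter`. The skill list and the helper name `top_skills` are made up for illustration.

```python
from collections import Counter

# Hypothetical cleaned skill column, one entry per (posting, skill) row.
skills = ["sql", "python", "sql", "excel", "sql", "python", "tableau"]


def top_skills(skills, n=10):
    """Count rows per skill and keep the n most frequent."""
    counts = Counter(skills)      # group by skill, operation: count rows
    return counts.most_common(n)  # sort descending, keep the top n rows


top = top_skills(skills, n=2)
print(top)
```

The order of the steps matters in the editor for the same reason it does here: keeping the top rows only gives the most common skills if the table has already been sorted in descending order.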

489:58

and Bam here we have it and so if I

490:01

needed to reference the top 10 skills

490:03

any time all I would have to do is just

490:04

reference this query and I wouldn't have

490:06

to like we did last time go through this

490:08

full analysis so power query is

490:10

really great at automating some

490:12

repetitive analysis and having it just

490:14

ready for

490:17

you all right last little cleanup if we

490:20

look at these skill names they're not

490:22

formatted correctly specifically if I

490:24

look at something like SQL I expect to

490:26

be all capital letters SQL capital

490:28

letters python I expected to be Capital

490:30

At the beginning python so we're going

490:32

to go through and actually fix this so

490:34

that way whenever we present our data to

490:36

someone it doesn't look like a hot mess

490:39

so opening up the power query menu by

490:41

pressing alt F12 we're going to go into

490:44

the data jobs skills query specifically

490:47

on that last step and we want to

490:50

alter the job skills column so the first

490:52

thing I want to do with this text

490:54

cleanup the easiest thing looking at

490:56

this is we just need to capitalize the

490:59

first letter of every single word and

491:02

then from there we'll go through and

491:03

actually fine-tune it to capitalize in

491:06

case of SQL capitalize all letters we'll

491:08

have to put in special case for this

491:09

anyway if you recall from before we have

491:11

that transform format and they have this

491:14

capitalize each word we're going to do

491:17

that the next thing though the more

491:19

complicated one is we're going to go

491:20

into add column and we're going to add a

491:24

conditional column so what we're going

491:25

to do is go through we're going to keep

491:27

the name of custom column cuz we're

491:29

technically going to be since we're

491:30

adding a column we're going to have to

491:31

go and delete this job skills column

491:33

once we create this new one I don't want to

491:35

name it job skills right now going to

491:36

keep it anyway what we want to do is we

491:39

want to select the column that we want

491:40

so if job skills equals in this case we

491:43

expect to equal something like SQL we

491:46

want the output to equal SQL then if we

491:49

want to add more conditions or Clauses

491:51

to it we go to add Clause once again I'm

491:54

going to select job skills and I'm going

491:56

to put something like

491:57

Power Bi it had a lowercase i at the end

492:00

I want the Power BI to be fully

492:02

capitalized at the end I also went

492:04

through and added some other ones such

492:06

as AWS GCP NoSQL and SAS almost all these

492:11

required them to just capitalize fully

492:14

except for the NoSQL one then what do

492:17

we want it to be if it's not any of

492:18

these conditions well we'll add this

492:19

else clause and we want it to be

492:23

basically the results of an entire

492:25

column we want it to be whatever it is

492:27

already in the job skills column I'm

492:30

going to go ahead and click okay so now

492:32

we have this cleaned up data set as well

492:36

with nice looking names now if you want

492:38

to if you're going through and finding

492:40

anything in here that you want to clean

492:41

up feel free to add to that conditional

492:43

column statement those are the ones I'm

492:45

just going to go for right now anyway

492:46

because we added this new column and I

492:48

don't really know an easy way to do this

492:51

without actually creating this new

492:53

column we need to now go ahead and

492:55

remove job skills and rename custom so

492:59

going to the Home tab I'm going to

493:01

remove column I'm going to remove the

493:03

one that's selected and I'm going to

493:05

rename custom to job skills and

493:09

conveniently because we're using that

493:11

same name and just replacing it if I go

493:13

to the data jobs skill count that one

493:16

because it references this one will also

493:19

get updated and all those values in

493:21

there are updated as well anyway let's

493:24

go ahead and close and load and inspect

493:27

this is our previous pivot table and

493:30

pivot chart that we analyzed it's now

493:31

going through and loading all the data

493:34

and now we have it updated with all that

493:36

correct formatting for those different

493:38

data points one last thing before we go

493:41

this is generic these top skills of data

493:43

nerds for all data nerds and that is using

493:45

the data job skills query which has the

493:48

job title short column in it so we can

493:51

actually visualize this for a certain

493:53

job by going into pivot chart analyze

493:56

I'm going to go into insert slicer

493:58

specifically we're going to look at job

493:59

title short I'm going to put it over

494:01

here and then as usual I'm going to

494:04

rename it real quick to job title and

494:07

now let's say we want to analyze

494:08

something like data analyst we can see

494:11

that SQL is the top skill but Excel is

494:15

in second place followed by python

494:17

Tableau and SAS what about for business

494:20

analysts very similar in that SQL's top

494:23

and then Excel is in that second place

494:25

so really unique and showing the

494:27

importance of excel Within These skills

494:29

and pretty meta that we used Excel to

494:32

find this out all right now it's your

494:34

turn to give it a shot you have some

494:35

practice problems to go through and get

494:38

more familiar with doing these Advanced

494:40

Transformations specifically pivoting

494:42

unpivoting and then also Group by all

494:45

right with that I'll see you in the next

494:47

one we're going to be diving into append

494:49

and merging queries specifically going

494:52

to be doing this with that skill query

494:54

that we did previously all right see you

494:56

there
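
The text cleanup from this lesson (trim, capitalize each word, then a conditional column for special cases) can be sketched in Python as follows. The `SPECIAL_CASES` table and the helper `clean_skill` are assumptions for illustration. The exact skill spellings in the data set may differ, so treat the keys as placeholders.

```python
# Hypothetical special cases: value after "capitalize each word" on the
# left, the spelling we actually want on the right.
SPECIAL_CASES = {
    "Sql": "SQL",
    "Power Bi": "Power BI",
    "Aws": "AWS",
    "Gcp": "GCP",
    "Nosql": "NoSQL",
    "Sas": "SAS",
}


def clean_skill(raw):
    """Trim, capitalize each word, then apply the special cases."""
    trimmed = raw.strip()  # remove leading and trailing white space
    capitalized = " ".join(word.capitalize() for word in trimmed.split(" "))
    # Conditional column: use the special spelling if there is one,
    # else fall back to the capitalized value (the "else" clause).
    return SPECIAL_CASES.get(capitalized, capitalized)


for raw in [" sql", "python", " power bi", "nosql"]:
    print(repr(raw), "->", clean_skill(raw))
```

A lookup table like this plays the role of the clauses in the conditional column dialog: adding another special case is one more entry instead of one more clause.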

495:00

let's now get into how to perform append

495:04

and also merges and so the first portion

495:07

of this lesson the easiest portion in my

495:09

opinion is going to be append

495:11

specifically going back to that Excel

495:13

sheet where we had all those different

495:16

uh sheets for the months of the year and

495:19

their job postings on each because all

495:20

these data sets are of the same format and

495:22

have the same columns we're going to be

495:24

able to append all these together and

495:26

get what is our final data set of all

495:28

30,000 rows if you recall each month had

495:31

around 3,000 postings so that's how we

495:33

get to that value from there the primary

495:35

focus of this lesson will then shift to

495:37

merge for this we're going to be

495:39

combining our two queries that we built

495:42

previously one which was our original

495:45

data set so we titled that one data jobs

495:47

salary and then that new query that we

495:50

created in the last lesson on the skills

495:53

so data job skills we're going to be

495:55

merging those two together

495:57

and this will allow us to do some pretty

495:59

interesting analysis specifically now

496:01

that we've merged those we'll be able to

496:03

see based on a skill what is the

496:06

expected salary and we're going to build

496:08

a visualization for that for the top 10

496:10

skills now merge unlike append is a very

496:14

complex operation mainly because there's

496:17

a lot of different types of merges

496:19

specifically there's six types of merges

496:21

in Microsoft alone so we're going to be

496:23

walking through each one of those so you

496:25

understand the differences and know

496:27

which one to use when for this first

496:29

append example we're going to be using

496:31

this data job salary monthly data set

496:35

and just as a refresher this contains

496:37

everything for in this case I'm selected

496:39

on the January sheet down here and this

496:41

has all the January data which has

496:43

around 3,100 rows for this and we have

496:47

each one of the months for the year

496:52

here anyway let's use power query to

496:54

append all these together because

496:57

previously before you knew about this

496:58

you'd have to go through and actually

497:00

copy and paste all these different

497:03

options right here and then put it into

497:04

a new sheet doing this 12 times is a hot

497:07

mess so since this is only a simple

497:09

example that we're not going to use

497:10

later on I recommend just opening up a

497:13

new workbook for this now coming into

497:15

the data tab I can come down to get data

497:18

and they do have this option right here

497:20

for Combined queries merge and also

497:22

append but this is for append two

497:25

queries from within this workbook

497:27

it's basically assuming you've already

497:29

imported it in so instead what we need

497:32

to do is actually go to from file and

497:34

actually start our first query of

497:36

connecting to that Excel workbook with

497:38

all those different sheets navigating to

497:40

the course underneath resources data

497:42

sets and then here down on data job

497:44

salary monthly I'll select that select

497:46

Import in the Navigator we can see all

497:48

the different sheets that are available

497:50

we want to actually enable this option of

497:52

select multiple items and then go

497:54

through and select all the items with

497:56

all these loaded we're going to then

497:58

shift into not just loading it we want

498:00

to actually go into the power query

498:01

editor so I'm going to select transform

498:02

data and it's going to start by loading

498:04

each one of those sheets and just going

498:06

to be naming each one of the queries

498:08

respectively after those sheets with

498:10

power query editor launched we can see

498:12

over here in the left hand pane all 12 of

498:15

those queries for each of the months so

498:17

these are all their own separate queries

498:20

because of that we need to now move into

498:22

actually appending them and make it one

498:25

final query that we can actually export

498:27

into or import into Excel so underneath

498:29

the Home tab they have the option for

498:32

combine append queries they have append

498:34

queries and append queries as new with

498:36

the January query selected I'm going to

498:39

go to append queries and for this I can

498:42

say either do two tables and specify the

498:45

table I want to do we're going to do

498:46

three or more cuz we want to do all of

498:48

them with them all selected I'll now go

498:50

through and click okay to append now

498:52

this inserted a step of appended queries

498:56

inside of that January query so

498:58

now that January query is all those

499:01

different data sets so I just want to

499:03

verify that I got all the data in here

499:05

right now if we scroll down well I'm

499:06

just going to show it right here we're

499:07

only showing column profile based on the

499:09

top 1,000 the fastest way to actually

499:12

find this out is just go to the

499:13

transform Tab and go to count rows which

499:17

it tells me there's 36,000 rows which

499:19

it's a few thousand too many and if I go

499:22

back into the appended query option and

499:25

actually look into it I can see in the

499:27

formula bar we have August in here I

499:29

accidentally selected it twice so I'll

499:31

go ahead and delete it and then look at

499:33

the counted rows that's actually what I

499:35

expect the value to be around 32,000
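
The append and the row-count check can be sketched in Python like this. It is a minimal illustration with made-up monthly tables and sample dates, and the helper `append_tables` is hypothetical. The point is that stacking tables is simple, while selecting one table twice quietly inflates the total unless you check for it.

```python
# Hypothetical monthly tables with the same columns, keyed by sheet name.
monthly = {
    "Jan": [{"job_posted_date": "2023-01-05"},
            {"job_posted_date": "2023-01-20"}],
    "Feb": [{"job_posted_date": "2023-02-11"}],
    "Mar": [{"job_posted_date": "2023-03-02"},
            {"job_posted_date": "2023-03-30"}],
}


def append_tables(tables, names):
    """Stack the named tables in order, refusing repeated selections."""
    repeated = sorted({n for n in names if names.count(n) > 1})
    if repeated:
        raise ValueError(f"table selected more than once: {repeated}")
    combined = []
    for name in names:
        combined.extend(tables[name])  # append rows below the previous ones
    return combined


all_rows = append_tables(monthly, ["Jan", "Feb", "Mar"])

# Double check: the appended table should have exactly as many rows
# as the individual tables added together.
expected = sum(len(table) for table in monthly.values())
assert len(all_rows) == expected
print(len(all_rows), "rows appended")
```

Comparing the combined count against the sum of the parts is the same sanity check as count rows in the editor, just made automatic.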

499:37

anyway that was just to count the rows

499:39

additionally I don't want the append

499:41

query to be inside of that January query

499:43

so I'm going to delete this step as well

499:45

instead with the January query selected

499:48

I'll go back to that home append queries

499:51

and then select append queries as new

499:55

this is going to create a completely new

499:57

query once again we want to do three or

499:59

more tables this time I'm going to hold

500:01

control and select all of them and then

500:04

move them over at once make sure we

500:06

don't have duplicates this time so this

500:09

now starts a new query right now it's

500:11

called append one I would probably name it

500:12

something like data jobs all and then

500:15

pressing enter it then loads in here but

500:17

you can see these queries like imagine

500:19

the case where I right now we have 13

500:21

queries I want to organize these a

500:23

little bit better so we can actually

500:25

group these specifically we can group

500:26

these monthly ones I selected April and

500:29

then holding control selecting all the

500:31

other queries as well then right clicked

500:33

it and I'm going to select this option

500:35

to move to group we need to have a new

500:38

group and I'll call this real uniquely

500:41

data jobs

500:42

monthly and click okay so now we have

500:45

these two folders one with data jobs

500:47

monthly I'm going to close that down and

500:49

then there's other queries which we've

500:50

seen before and there's one query inside

500:52

of this of data jobs all this cleans it

500:54

up also you may get this disclaimer up

500:56

here the preview may be up to 33 days

500:58

old feel free to refresh it if you've

500:59

been getting that should have no effect

501:01

on your data then if we wanted to we

501:03

could go through and actually Analyze

501:05

This by pressing close and load to I

501:07

prematurely selected close and load

501:10

I recommend you select close and load to

501:11

anyway nonetheless I'll go to the data

501:13

jobs all we'll go to load to

501:16

specifically I want to look at a pivot

501:17

table I know there's going to be some

501:19

data loss because it's going to remove

501:20

the data in the sheet and then I can

501:22

inspect that job posted date

501:24

specifically for the count dragging

501:27

job posted date into the rows and then

501:29

also dragging job posted date into

501:31

values and once again this is why we

501:33

double check it this time it looks like

501:36

I accidentally imported in January twice

501:40

with this as we can see that it's

501:42

35,000 anyway opening up that power

501:45

query editor going to the data jobs all

501:47

query and updating it to remove that

501:50

second January that I should have caught

501:51

from before and then close and loading

501:54

it and now it should refresh and update

501:56

for these correct values boom so

501:59

now it's actually aligned with what I

502:00

expect to see this is why we always double

502:02

check any type of query or analysis you

502:05

do this double check of the work is

502:07

going to save your

502:10

butt all right let's now get into the

502:12

bulk of this lesson I'm moving into

502:14

merge for this feel free to continue

502:17

working with that workbook that you were

502:19

working with in the last lesson if you

502:22

didn't happen to save it or you got lost

502:24

you can use the advanced transform

502:26

workbook from the last lesson that'll

502:28

pick right back up where we left

502:29

off and then as usual the append and the

502:32

merge are the final examples that you're

502:34

going to see at the end of this which

502:36

specifically for append you've already

502:38

seen so let's actually get into merging

502:40

those queries for this I want to press

502:41

alt F12 and right now we have three

502:45

queries in here the data job salary

502:47

which is basically like our fact table

502:49

this includes all of our data going into

502:52

transform and count rows we have as

502:55

expected around 32,000 data points I'm

502:56

going to go ahead and delete that step

502:58

similarly we have this data jobs skills

503:01

which has all of our skills in it let's

503:03

see how many rows are in this by going

503:05

up to transform and to count rows and

503:07

this has

503:10

167,000 now it's important to understand

503:12

these numbers because we're going to be

503:13

using them or need to understand them

503:15

whenever we actually get into the joins

503:16

to see when we have missing or more data

503:19

so I'm going to go ahead and delete the

503:20

step of counted rows as well we don't

503:22

need it then we have also this final

503:24

query of data job skills count this was

503:27

made as an example only we're not going

503:28

to use this any further into the future

503:31

so I'm actually going to go ahead and

503:33

just delete this to minimize my queries

503:35

it's going to ask if I'm sure I want to

503:37

delete it yep so let's get into merging

503:39

these queries I have data job salary

503:40

selected come up to the Home tab under

503:43

merge queries we're going to have merge

503:45

queries and merge queries as new like we

503:48

learned from the append of append queries

503:51

and append queries as new we're going to

503:53

want a new query so that way we still

503:55

have these Source queries so I'm going

503:57

to go merge queries as new with this

503:59

this merge window pops up and it says

504:01

select the tables and matching columns

504:04

to create a merge table specifically we

504:07

want to go with the data jobs salary and

504:10

we want to merge it on the job ID that's

504:13

why we created that a few lessons ago

504:15

we're trying to connect to the data jobs

504:18

skills on also that job ID now down here

504:23

underneath this there's a join kind and

504:26

there's six different options from this

504:29

of left outer right outer full outer

504:31

inner left anti and right anti now Kelly

504:34

put together this fancy chart that shows

504:37

visually what is happening with these

504:40

merges and we're going to be walking

504:43

through all of these briefly in order to

504:46

understand which type of join you should

504:49

be choosing depending on which scenario

504:52

you're in as a quick overview these

504:54

circles are signifying the two different

504:56

tables so in this case table a and table

504:59

B and the Shaded Blue Area shows what

505:03

portion of the contents from those

505:06

tables will be included in the final

505:09

table first up is a left outer join and

505:13

with this join what's showing here is

505:16

that all rows from table a will be

505:19

included in the final table and then

505:22

from that Center portion right there

505:24

where A and B overlap this signifies

505:26

that it's only going to keep items from

505:29

table B that are in table a or match

505:33

with table a so what does it actually

505:35

mean so if we go here into join kind and

505:37

select left outer and then what we get

505:40

told based on this next to this check

505:42

mark is the selection matches 29,000 of

505:45

32,000 rows from the first table so what

505:49

are those missing jobs well basically

505:51

there's some jobs that don't have a

505:52

skill now this isn't necessarily a bad

505:55

thing although we're not going to go

505:56

with this join this could be an option

505:58

we could use I'm going to click okay to

506:00

load it in so right now we have it under

506:02

this query called merge one and as you

506:04

can see there's not repeating any job

506:07

IDs basically we have the original data

506:10

jobs salary table and then we scroll all

506:12

the way to the right we have the data

506:14

job skills over here and if you see each

506:17

one of these items is a table if I click

506:20

on it and expand it to see hey what's in

506:22

this table we can see that for this one

506:25

the job posting or job ID of 10,001

506:28

this is the table associated with

506:31

it so I'm going to go ahead and actually

506:32

delete out of this step and go back to

506:35

it so what we could do is expand it out

506:38

and there's this icon up in the top

506:41

right hand corner I'm going to go ahead

506:42

and click it and it's going to ask me

506:45

how it wants to basically expand out and

506:49

in this case I already have the job ID I

506:51

already have job title short I would

506:52

expand it by job skills so now seeing

506:56

how these skills are broken over I can

506:57

actually scroll all the way over and see

506:59

that now the 10,001 ID is duplicated

507:02

multiple times and if I actually looked at

507:04

the number of rows within this data set

507:07

this new data set we have 170,000 rows
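
To make those row counts concrete, here is a minimal Python sketch (not Power Query itself) of the six join kinds from the merge window, already expanded so every match gets its own row. The toy jobs and skills tables, the merge helper, and the underscore spellings of the join kinds are all made up for illustration; only the job_id key mirrors the lesson. Job 3 plays the part of the roughly 3,000 jobs that have no skills.

```python
def merge(left, right, key, kind):
    """Join two lists of dicts on `key`, one output row per match (already 'expanded')."""
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    left_cols = [c for c in left[0] if c != key]
    right_cols = [c for c in right[0] if c != key]

    rows = []
    # Matched pairs: one row for every (left row, right row) that shares the key.
    if kind in ("inner", "left_outer", "right_outer", "full_outer"):
        for l in left:
            for r in right:
                if l[key] == r[key]:
                    rows.append({**l, **r})
    # Left rows with no partner, padded with None for the right-hand columns.
    if kind in ("left_outer", "full_outer", "left_anti"):
        for l in left:
            if l[key] not in right_keys:
                rows.append({**l, **dict.fromkeys(right_cols)})
    # Right rows with no partner, padded with None for the left-hand columns.
    if kind in ("right_outer", "full_outer", "right_anti"):
        for r in right:
            if r[key] not in left_keys:
                rows.append({**dict.fromkeys(left_cols), **r})
    return rows


# Toy stand-ins for the salary (table A) and skills (table B) queries.
jobs = [
    {"job_id": 1, "job_title_short": "Data Analyst"},
    {"job_id": 2, "job_title_short": "Data Engineer"},
    {"job_id": 3, "job_title_short": "Business Analyst"},  # no skills listed
]
skills = [
    {"job_id": 1, "job_skills": "sql"},
    {"job_id": 1, "job_skills": "excel"},
    {"job_id": 2, "job_skills": "python"},
]

kinds = ["left_outer", "right_outer", "full_outer", "inner", "left_anti", "right_anti"]
counts = {kind: len(merge(jobs, skills, "job_id", kind)) for kind in kinds}
print(counts)
# {'left_outer': 4, 'right_outer': 3, 'full_outer': 4, 'inner': 3, 'left_anti': 1, 'right_anti': 0}
```

Left outer and full outer land on the same count here because every skill row has a matching job, so the only extra rows are the jobs without skills.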

507:12

now technically this merge has exactly

507:14

what we want but we still need to go

507:16

through those other merge examples to

507:18

understand them so we're going to show

507:20

them as well now for this I want to go

507:22

back to that merge window and I'm going

507:24

to click the settings icon I need to get

507:26

rid of the step we're going to be trying

507:27

out different types of merges so I'm

507:29

going to exit out and then go in here and

507:31

click the gear icon now it's popping

507:33

back up we did left outer next thing

507:35

we're going to look at is Right outer

507:37

for right outer this takes all of the

507:40

rows out of table B and then from there

507:43

any that match those rows in table a are

507:47

included now this one when we look down

507:49

here it says Hey the selection

507:51

matches

507:53

167,000 of 167,000 rows from the second

507:57

table if you recall back from that left

507:59

outer we had

508:01

170,000 so 3,000 higher why is that well

508:05

that table a or data job salary has

508:08

3,000 roles in here that don't have any

508:10

skills listed hence why 3000 is less

508:14

this provides a similar type of merge

508:17

that we did before where we need to

508:18

actually go over to that data job skills

508:20

and expand it out selecting the job

508:23

skills column and with this table we can

508:25

just check that we have 167,000 rows

508:28

which bam we confirm all right I'm going

508:30

to get rid of these two steps we're

508:32

going to move into the next merge next

508:34

is inner join and this provides only

508:37

matching rows from table a and matching

508:41

rows from table B so depending how

508:43

you're joining it there could be missing

508:45

data on both A and B for this one it's

508:47

saying hey the selection matches about

508:49

29,000 of 32,000 rows from the first

508:51

table which is what we expect and then

508:54

basically all of the rows from the

508:56

second table so this one if I actually go

508:58

into it and then expand out those data

509:01

job skills looking only at the job

509:03

skills column with it expanded out

509:05

actually counting the rows we have once

509:08

again 167,000 so missing that 3,000 of

509:11

jobs that don't have skills next is left

509:14

anti and in this case it checks to see

509:17

what matches it doesn't have and Returns

509:20

the value for that specifically for

509:22

table a whichever values don't have a

509:24

match it's going to return that so in

509:26

this case it says the selection excludes

509:28

29,000 out of the 32,000 when I go to

509:31

load it I get the rows from table a or

509:34

data jobs salary and it still has the

509:37

data job skills but actually if I looked

509:39

into here right we should be matching on

509:41

things that don't match or don't have a

509:44

value specifically there shouldn't be

509:46

anything inside this table that I'm

509:47

clicking on and as expected they're null

509:50

values because it doesn't have skills so

509:53

exiting out of navigation going back to

509:54

Source counting these rows we can see

509:57

that we have 3,000 jobs basically with

510:01

no skills for right anti this gets rows

510:04

from the right table that do not have

510:06

matches in the left table and for this

510:09

with right anti- selected this selection

510:11

excludes 167,000 out of 167,000 rows from

510:14

the second table so basically everything

510:17

from this table is excluded we're not

510:19

going to walk through this in the power

510:20

query cuz this is also not what we want

510:22

the final one we're going to actually

510:23

use is a full outer join from this it

510:27

takes all rows from table a and all rows

510:30

from table B and if there's a match it

510:32

will join those two if there's no

510:34

matches it's still going to return them

510:35

in the table it will just be a null

510:37

value for where it doesn't match up and

510:39

this talks about how basically selection

510:41

matches 29,000 of 32,000 rows from the

510:44

first table and all the rows from the

510:45

second table loading this in once again

510:48

we have data job skills we need to

510:50

expand out and we only want to expand

510:52

out those job skills and then from there

510:54

just going to do a double check I'm

510:55

going to do count rows and this has

510:59

170,000 rows in it so similar to our

511:03

left outer we could have done either of

511:04

these these are one of the two that we

511:06

want but I'm going to stick with this

511:08

one of the full outer because I have all

511:09

the work here anyway I'm going to close out

511:11

the step and I think that's a great

511:13

example of sometimes there may be

511:15

multiple joins that fit the example it's

511:18

important that you go through and

511:19

actually count the rows and understand

511:22

the data set to figure out which one you

511:23

need to use and for what purpose anyway

511:26

one thing I glossed over real quick

511:27

going back to source and that gear icon

511:30

is right underneath this underneath the

511:32

join kind they have use fuzzy matching

511:35

to perform the merge right now we're

511:37

doing basically exact matching as the

511:39

job ID of 10,001 we're matching up exactly

511:42

with the 10,001 from the other table fuzzy

511:45

matching allows you to connect to tables

511:47

that have basically non-exact matches so

511:50

in this case we have table a with a

511:52

student ID and a student's name and only

511:54

their first name but then in table B we

511:57

have the student name full so first and

512:00

last name and the grade with the fuzzy

512:03

matching we could merge table A and B

512:07

based on that student name First Column

512:09

and the student name full column now

512:12

what happens if we get to where we have

512:13

students with multiple similar first

512:16

names it's going to create a hot mess so

512:18

I don't always recommend using this

512:20

unless you know the data and you know

512:21

you're not going to cause complications with

512:23

it so that was a quick overview of

512:26

the different joins within power query

512:29

if you want a more indepth tutorial for

512:32

how this is done then you can check

512:34

out my SQL tutorial where I go through

512:36

it with all the different SQL analysis

512:37

that we do in that course and break it

512:39

down step by step I'll include a link to

512:41

that video right here for you to go and

512:43

see

512:46

it all right so we have the final table

512:49

that we actually want for this remember

512:50

these do have duplicate values in it so

512:52

you have to keep that in mind anytime

512:53

you're doing analysis I'm going to

512:55

rename this as data jobs merged one last

512:59

thing for close and load we have this

513:01

job skills column which is sort of

513:02

redundant right now because we actually

513:04

have the data job skills not job skills

513:07

the actual skills itself so I need to

513:09

get rid of this column I actually want

513:10

to do this I'm going to do this in the

513:13

source step before we even break this

513:15

out so I'm going to select job skills

513:17

and select remove columns it's going to

513:19

ask if I want to insert the step which I

513:22

do and then after we remove the columns

513:25

we go into expanding it out and because

513:29

we did it in that order I can actually

513:30

come in here instead of renaming it here

513:32

I can just rename it via the formula

513:36

inside of expanded skills and just

513:37

change it to job skills and Bam now I

513:41

only added two steps Vice one all right

513:44

go ahead now we're going to close and

513:47

load to I'm going to want a pivot table

513:49

and also pivot chart so I'm going to

513:50

select the pivot chart option here and

513:53

underneath queries and connections it's

513:54

going to show that it's loading this in

513:56

here under data jobs merged so let me

513:59

show you what we're going to be creating

514:00

with this I want to build this

514:01

visualization that's showing what is the

514:03

salary of the top 10 skills top 10

514:06

skills by count for data nerds and this

514:09

is a combo chart we're going to have not

514:11

only the salary or the average salary

514:14

for a skill but also for this line

514:17

portion we're going to have the

514:19

associated count for the number of

514:22

skills that appears or how many jobs it

514:24

appears in all right so I'm going to go

514:25

ahead and move this pivot chart out of

514:27

the way and select the pivot table

514:29

remember we want to use the job skills

514:31

we're going to be analyzing that so I'm

514:33

going to throw in the rows the first

514:35

thing I'm going to look at is the

514:36

easiest is the count of these job skills

514:39

and I'm going to rename this to job

514:42

count along with changing the value

514:44

field settings going to number format I

514:47

want to change the number specifically I

514:48

want to use a thousand separator with zero

514:51

decimal places I'll go ahead and press

514:52

okay so we have a count now we want the

514:55

average salary so I'm going to take

514:56

salary year average drag it into the

514:59

values right now it's doing a sum so

515:01

I'll go into value field settings select

515:04

average and then for number format we're

515:06

going to do currency with zero decimal

515:09

places click okay and okay again and I'm

515:12

going to change this one to average

515:15

salary and then specify the units of USD

515:19

all right so now xing out of this and

515:21

xing out of this now our pivot chart is

515:24

sort of all jacked up well it is jacked

515:26

up mainly it's trying to plot this as like

515:28

a dual column chart and that's not what

515:30

we want so we're going to change this

515:32

design of it going to design change

515:35

chart type I'm going to go over to combo

515:38

and then underneath here for the combo

515:40

for the job count I want that to be a

515:44

line so I'm going to go up here and

515:45

select line and for the average salary I

515:47

actually want that to be the column now

515:50

I want the job count on a secondary axis

515:53

I don't want the same axis as the salary

515:56

itself because they're just not

515:57

proportional I'm going to go ahead and

515:59

click okay I want to clean this up a

516:01

little bit further by removing the

516:03

legend and then also right clicking here

516:06

and hiding all field buttons on this

516:08

okay now there's still too many

516:10

skills on here remember we want the top

516:12

10 skills so going into the pivot table

516:15

itself I'm going to come up into the

516:17

filter into value filters and top 10

516:22

we're going to do top 10 items by job

516:24

count all right this is getting a lot

516:26

more readable now because I have the top

516:29

10 by job count I want to order this

516:32

from high to low by salary so I'm going

516:34

to go to more sort options and we're

516:36

going to do descending on average salary

516:40

I'll click okay and Bam now we're

516:42

getting somewhere so we're seeing things

516:44

like spark and AWS have the highest and

516:47

Excel did make the top 10 so it's on

516:49

there at 100,000 other things I'm going

516:52

to change selecting on this pivot chart

516:53

is the actual design itself you know how

516:55

I am about colors so we're going to

516:57

change the colors I'm going to use this

516:59

monochromatic palette 8 I want the line

517:01

to be a lighter color than the actual

517:03

bars itself I'm going to go ahead and

517:05

add axis titles for primary vertical

517:07

and secondary vertical for this I'm

517:09

going to just select the box go into the

517:11

formula bar and say hey for this one

517:13

make it equal to average yearly salary

517:15

for this one selecting the Box going

517:17

into the formula bar pressing equal I'm

517:19

going to make it equal to job count I'm

517:21

also going to add a title to this I'm

517:24

going to call this what is the salary

517:26

of the top 10 skills of data nerds and

517:29

remember this is for all data nerds so I

517:32

want to be able to actually what's the

517:34

great thing about this of joining these

517:35

tables now we not only get salary data

517:37

but we can get job title information so

517:39

I'm going to add a slicer now by going

517:41

in pivot chart analyze insert slicer add

517:45

in that job title short only going to

517:47

move that out of the way now I'm going

517:49

to go to slicer I'm going to rename this

517:52

to a more friendly title of job title

517:55

and now let's actually look at it

517:56

for data analyst so with this looks like

517:59

python arlor the highest Excel still

518:02

makes that top 10 and for data analysts

518:05

at

518:06

86,000 it's also if we look at this it's

518:09

the second most important skill behind

518:12

SQL which has a value of 96,000 let's

518:17

see what it is for a business analyst

518:19

once again SQL and Excel are two of the

518:21

highest and for business analysts Excel

518:23

is paying 87,000
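
For anyone who wants to see what that pivot table is doing under the hood, here is a rough Python sketch of the same steps: count each skill, average the salary per skill, keep the top N by count, then sort high to low by average salary, with an optional filter standing in for the slicer. The rows and salary figures are invented, top_skills is a hypothetical helper, and salary_year_avg is only an assumed spelling of the salary column; job_skills and job_title_short follow the names used in the lesson.

```python
from collections import defaultdict


def top_skills(rows, n=10, title=None):
    """Return (skill, job_count, average_salary) for the top n skills by count."""
    count = defaultdict(int)
    salary_total = defaultdict(float)
    salary_rows = defaultdict(int)
    for row in rows:
        # Optional filter, playing the role of the job title slicer.
        if title is not None and row["job_title_short"] != title:
            continue
        skill = row["job_skills"]
        if skill is None:
            continue  # jobs with no skills contribute nothing to any skill
        count[skill] += 1
        if row["salary_year_avg"] is not None:
            salary_total[skill] += row["salary_year_avg"]
            salary_rows[skill] += 1
    # Value filter: keep the n most frequent skills.
    top = sorted(count, key=count.get, reverse=True)[:n]
    summary = [
        (s, count[s], salary_total[s] / salary_rows[s] if salary_rows[s] else None)
        for s in top
    ]
    # Sort descending on average salary, like "more sort options" in the pivot.
    return sorted(summary, key=lambda item: item[2] or 0, reverse=True)


# Invented merged rows: the job's salary repeats once per skill, as in the merged query.
rows = [
    {"job_id": 1, "job_title_short": "Data Analyst", "job_skills": "sql", "salary_year_avg": 90000},
    {"job_id": 1, "job_title_short": "Data Analyst", "job_skills": "excel", "salary_year_avg": 90000},
    {"job_id": 2, "job_title_short": "Data Engineer", "job_skills": "sql", "salary_year_avg": 110000},
    {"job_id": 2, "job_title_short": "Data Engineer", "job_skills": "python", "salary_year_avg": 110000},
    {"job_id": 3, "job_title_short": "Business Analyst", "job_skills": None, "salary_year_avg": 70000},
]

result = top_skills(rows, n=2)
print(result)  # [('sql', 2, 100000.0), ('excel', 1, 90000.0)]
```

Because each job's salary is repeated on every skill row, averaging per skill is fine, but summing or averaging salary across the whole merged table would double count jobs with many skills.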

518:26

so bam we just showed the power of well

518:28

append but also more specifically merge

518:31

we can now take this analysis to another

518:33

level analyzing skills to other data

518:36

points from our main fact table or that

518:39

data jobs salary table that has all of

518:41

the data in it so now you have some

518:43

practice problems to go through and get

518:45

more familiar with using both append and

518:48

also merge after that we'll be jumping

518:50

into the last lesson of power query

518:53

focusing on the M language as I warned

518:55

at the beginning don't worry if you

518:57

don't have coding experience or anything

518:58

like that we're going to be taking it

518:59

nice and easy and you're going to be

519:00

able to follow along and fill it out

519:02

pretty easily we're going to be doing

519:04

some final prep before we finally send

519:06

this data set on over to power pivot

519:08

which we're going to cover in the next

519:09

chapter all right with that I'll see you

519:15

there welcome to this final lesson on

519:17

the M language and we're going to be

519:20

going into some pretty Advanced

519:22

Techniques and understanding how to read

519:25

and better utilize the M language in

519:27

building your power query queries anyway

519:31

nothing in this lesson is going to be

519:34

used that we actually go through and do

519:36

used to build on our project so if any

519:39

time you're not following along or

519:40

you're not able to do anything don't

519:42

worry too much nothing's actually being

519:44

used it's more to inform you about the M

519:47

language so you get more familiar with

519:48

it as a disclaimer you will not be an

519:51

expert on M language you will not be able to

519:52

code in M language after this mainly

519:54

you'll just be able to look at it

519:56

understand what's going on there from

519:57

there and make slight adjustments if

519:59

necessary feel free to continue working

520:01

on in that worksheet that you've been

520:03

using previously where we just

520:05

calculated in the last lesson looking at

520:07

the top 10 skills and what the salary is

520:09

for them however if you got lost or

520:11

wasn't able to follow along or just

520:13

starting over feel free to use this

520:15

merge workbook don't use once again that

520:18

M language one that one's going to be

520:20

what is going to be done at the end of

520:21

this lesson so what are we going to be

520:23

covering in this lesson well if you open

520:25

up the power query editor we can

520:27

navigate into it we're going to be

520:28

covering three main things first is the

520:31

Advanced Editor actually walking through

520:34

a previous query and understanding how

520:36

to read it and then from there under add

520:38

column tab we're going to go into these

520:41

different examples on creating custom

520:44

columns and also custom

520:48

functions so what exactly is this m

520:52

language well if we dive in

520:54

documentation we can see that the power

520:56

query engine uses a scripting language

520:59

behind the scenes for all power query

521:01

Transformations the power query M formula

521:04

language also known as M so although

521:07

we're doing all these edits inside of

521:08

this power query editor behind the

521:10

scenes if we navigate to something like

521:12

the advanced editor it's actually using

521:15

this m language right here to carry out

521:18

all the Transformations and it goes on

521:21

to say if you want to do Advanced

521:22

Transformations using the power query

521:24

engine you can use the advanced Editor

521:26

to access the script of the query and

521:28

modify it as you want it even goes on to

521:31

discuss that if you're not finding what

521:33

you need in the actual GUI or the

521:35

graphical user interface of the

521:37

power query editor you can use the M

521:39

language editing it in the advanced

521:40

editor for

521:44

this so let's go into breaking down this

521:46

m language more by going to that data

521:48

jobs merge and entering the advanced

521:50

editor and we're going to be just

521:52

breaking down this simple query right

521:54

here up here on the right hand side

521:56

there's a few different options display

521:57

options I'm going to do this render white

522:00

space basically it shows me the

522:01

indentation that's going on here right

522:03

now I'm seeing that there's four spaces

522:05

in here anyway the key thing here is

522:08

we first have this let keyword

522:10

and then in keyword this Begins the

522:14

basically definition block if you will

522:16

this whole portion right here for

522:18

defining different variables and

522:19

specifically different tasks if we look

522:23

we have things like source expanded data

522:26

job skills sorted rows remove column

522:28

remove columns if I go ahead and move

522:31

this over to the right those applied

522:33

steps are the same thing those are the

522:36

variables itself I currently have enable

522:39

word wrap enabled and I'm not liking the

522:41

format and how it looks I'm going to go

522:42

ahead and unclick that finally we have

522:44

the in keyword and then this displays

522:48

the final value that we want to appear

522:52

for our query so in this case we want

522:54

the final value of rename columns or the

522:56

last applied step to be what appears now

523:00

this Advanced Editor I'm going to expand it

523:01

back out again is also a syntax Checker

523:04

so in this case let's say I deleted this

523:07

quotations at the end of this rename

523:08

columns it's going to one it's going to

523:10

give me these red squiggly lines to say

523:11

that hey there's something wrong here

523:13

and two it's going to actually give you

523:14

an error of invalid identifier and so we

523:18

would probably know that we probably

523:19

need to fix this so we're not going to

523:21

be breaking down much more of the

523:22

formulas here but I do want you to spot

523:25

two main things from this the first

523:27

thing is this column names column names

523:30

are always put in quotes in here and

523:34

conveniently they're also highlighted in

523:36

here so if you needed to do any changes

523:37

to column names or see what's happening

523:40

that's one quick way to identify it the

523:42

next thing is this every step that is

523:44

taken refers to the previous one what do

523:47

I mean by this so this first step is

523:49

assigned the variable of source

523:52

and I know it's assigned this variable

523:53

because it has an equal sign right next

523:54

to it
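
If the let ... in layout feels abstract, this small Python sketch mimics the idea: a list of named steps where every step is handed the result of the one before it, and the value after in is simply the last step. It is an analogy only, not M syntax; run_steps, the lambdas, and the two toy rows are hypothetical, and the step names just echo the applied steps shown in the editor.

```python
def run_steps(source, steps):
    """Run named steps in order; each step receives the previous step's result."""
    results = {"Source": source}
    previous = "Source"
    for name, func in steps:
        results[name] = func(results[previous])  # plug the prior step in
        previous = name
    # Like the expression after `in`: hand back the last step's value.
    return results, results[previous]


# Toy table standing in for the merged query.
source = [
    {"job_id": 2, "job_skills": "python"},
    {"job_id": 1, "job_skills": "sql"},
]

steps = [
    ("Sorted Rows", lambda table: sorted(table, key=lambda row: row["job_id"])),
    ("Renamed Columns", lambda table: [
        {"job_id": row["job_id"], "skill": row["job_skills"]} for row in table
    ]),
]

results, final = run_steps(source, steps)
print(list(results))  # ['Source', 'Sorted Rows', 'Renamed Columns']
print(final)          # [{'job_id': 1, 'skill': 'sql'}, {'job_id': 2, 'skill': 'python'}]
```

Every intermediate result stays available by name, which is roughly why you can click any applied step and see the table as it looked at that point.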

523:55

and then whenever we go to the next line

523:57

of expanded data job skills inside this

524:01

function of Table.ExpandTableColumn

524:04

it references source which if I scroll

524:07

over it I can see that it's giving me

524:09

the same formula for source which is

524:11

right above it so basically it's

524:12

plugging right into it similarly this

524:15

expanded data job skills is going to be

524:18

located in the next one below it on

524:20

sorted rows and it's going to be the

524:21

first value in here for this Table.Sort

524:23

and if you're curious about what

524:26

these different functions are doing you

524:27

can just scroll over it as well in this

524:28

case Table.Sort sorts the table using

524:31

one or more column names and

524:32

comparison criteria and it tells us via

524:35

the syntax inside the parentheses that

524:37

the first parameter is table so

524:39

it takes that previous variable which is

524:41

a table anyway one minor last thing

524:43

about this if you notice these are

524:45

surrounded by these variables have a

524:47

hashtag and then double quotes on each

524:49

side and that's because they have white

524:51

space in the actual names that we're

524:54

doing for this in the case of source

524:56

there's no white space it's only one

524:58

value with no white space so it doesn't

525:00

need to have this around it anyway why

525:02

am I yapping about all this stuff if you

525:03

need to understand this m language

525:05

anyway we're going to actually create

525:07

this data jobs merge query I'm going to

525:09

select it all press contrl C to copy it

525:12

then from there I'm going to close out

525:14

of it we're going to now create a new

525:16

query so underneath the Home tab I'm

525:18

going to go to new source and then

525:21

under that other source and I'm just

525:22

going to go into blank query okay right

525:25

now this is completely blank but I can

525:28

go into that Advanced Editor of query 1

525:31

and it has the let and in still and

525:34

obviously nothing going on here what I

525:37

can do is just highlight this all and

525:39

then using contrl V paste all of that

525:43

other query into this now when I press

525:47

done it goes through and actually

525:51

creates that same exact query from data

525:53

jobs merged now I could have gone

525:55

through and right click data jobs merged

525:57

and click duplicate but this is more of

525:59

to show that you can actually go in copy

526:02

queries or copy portions of queries and

526:05

thus paste it into other ones which

526:07

we're going to do in a little

526:11

bit so let's get into more of learning

526:13

about the M Language by actually

526:15

cleaning up this query one that we just

526:18

created by using this column from

526:20

example first thing though I do want to

526:22

rename this query one this is the one

526:24

we'll be working with for the remainder

526:26

of this lesson and I'm going to call it

526:28

data jobs clean because that's what

526:29

we're going to do we're going to clean

526:30

it up so we have four major tasks that

526:33

we're going to do with this the first is

526:35

for job schedule type I just want to

526:37

extract out the first value out of here

526:39

that's full-time out of it additionally

526:41

we're going to be using the date and

526:42

date time columns to extract the weekday

526:45

and also the hour of the job postings

526:48

and then finally we're going to do some

526:49

data cleanup on this job title column

526:52

that frankly is a mess specifically

526:54

we're going to move job postings that

526:56

have this parentheses remote around it
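
As a preview of what those four cleanup tasks amount to, here is a hedged Python sketch of each one on a single value. None of this is Power Query; the function names, the sample strings, the timestamp format, and the exact "(Remote)" spelling are assumptions made for illustration, and stripping the "(Remote)" tag from the title is just one reading of the fourth task.

```python
import re
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def first_schedule_type(value):
    """Keep only the first schedule type, cutting at the first comma or ' and '."""
    return re.split(r",| and ", value, maxsplit=1)[0].strip()


def weekday_and_hour(timestamp):
    """Return (weekday name, hour) from a 'YYYY-MM-DD HH:MM:SS' string (assumed format)."""
    dt = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return WEEKDAYS[dt.weekday()], dt.hour


def strip_remote(title):
    """Drop a '(Remote)' tag from a job title, ignoring case."""
    return re.sub(r"\s*\(remote\)", "", title, flags=re.IGNORECASE).strip()


print(first_schedule_type("Full-time and Part-time"))   # Full-time
print(first_schedule_type("Contractor and Temp work"))  # Contractor
print(weekday_and_hour("2023-01-02 14:30:00"))          # ('Monday', 14)
print(strip_remote("Data Analyst (Remote)"))            # Data Analyst
```

Cutting at a delimiter rather than taking a fixed number of characters is the safer choice for the schedule type, since the first value can be any length.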

526:59

anyway let's start with this first one

527:00

of this job schedule type if I go into

527:03

view and then look at the column profile

527:05

it looks like we have that full-time

527:07

contractor part-time and whatnot but we

527:09

have a lot of combines of full-time and

527:12

part-time contractor and temp work

527:13

full-time parttime and internship I

527:15

basically want to go through and just

527:16

extract out what is the first value that

527:19

appears in here so in the case of this

527:21

full-time and parttime just want to

527:23

extract full-time contractor and temp

527:24

work only contractor so under add column

527:27

and then column from example we'll do

527:30

from selection and this appears at the

527:33

top of add column from examples enter

527:35

sample values to create a new column

527:38

control enter to apply so I'll first go

527:40

by entering full-time and it's already

527:43

picking it up I'm just going to type it

527:45

in first okay and then I'm going to

527:47

scroll down but in this case I'm going

527:49

to put in hey I want full time for this

527:54

one this is the example remember so now

527:57

it's cleaning up that let's scroll down

527:58

further if it's done this fully for even

528:00

more okay it's getting the first of

528:02

these and you might think that this is

528:04

correct but the problem we're running

528:06

into now is if we go down to this one

528:09

where it says contractor it's only

528:12

contracto and just looking at the

528:14

formula this is the formula it's

528:16

generated so far it's doing Text.Start

528:19

and nine I don't really know too much

528:21

what's going on here but I'm assuming

528:22

that it's taking the first nine characters

528:24

that's not what I want so inside this

528:26

contractor one I'm going to type in

528:29

contractor with an R so that way it

528:31

hopefully fixes this so this is good and

528:34

now it has text before delimiter and a

528:36

space so I'm going to go ahead and click

528:39

okay to load this in so let's scroll

528:42

down to just inspect it to make sure

528:44

that we have this correct and an easier

528:47

way instead of scrolling down and trying

528:48

to find something I can just use this

528:49

drop down right here and look in here

528:52

and it looks like we're good

528:55

except for we now have a comma here

528:59

specifically I have a full-time and then

529:01

a full-time comma so what's going on

529:03

here well for values that have more than

529:06

two so three they actually insert a

529:08

comma in there and when we inspect our

529:12

formula opening up the formula bar here

529:14

it's only checking for a space so the

529:18

easiest way to fix this is actually just

529:21

like we did before we're pretty familiar

529:23

with it let's go to the transform tab

529:25

and then under replace values we want to

529:27

go to replace values specifically we

529:29

want to find commas we want to replace

529:31

it with a blank bam so now pulling down

529:34

that drop down we don't have multiple

529:37

different full times we just have that

529:38

single one without the comma we have

529:41

what we want all right we're going to

529:42

rename this and I can just go ahead and

529:44

double click this and rename it but I'm

529:45

actually going to do something first I

529:47

see that I have the step already for

529:49

renamed columns so I'm going to take

529:51

that and I'm going to drag it to the

529:53

bottom now with renamed columns as the

529:55

last step I'll then rename it to job

529:58

schedule type first press enter and then

530:00

it inserts it into that current step as

530:02

we can see from here cuz we're now

530:04

familiar with it and we don't have

530:05

multiple rename columns in there and

530:08

then finally you know how I get about

530:10

column ordering this job schedule type

530:12

first I want it next to the job schedule

530:14

type so I'm going to drag this on over

530:16

here see how long it takes and we've

530:19

moved it over and we now have this new

530:21

step of reordered columns all right
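
The job schedule type steps above can be sketched in M roughly as follows. This is a minimal sketch, not the exact code the editor generated: the column name job_schedule_type, the new column name job_schedule_type_first, the step names, and the sample rows are all assumptions made for illustration. The first attempt in the lesson, Text.Start with a count of 9, happens to fit "Full-time" (nine characters) but cuts "Contractor" down to "Contracto", which is why the delimiter-based version is used instead.

```powerquery
let
    // Hypothetical sample rows standing in for the real job postings table
    Source = #table(
        type table [job_schedule_type = text],
        {
            {"Full-time"},
            {"Full-time and Part-time"},
            {"Contractor and Temp work"},
            {"Full-time, Part-time, and Internship"}
        }
    ),
    // Keep only the text before the first space
    FirstValue = Table.AddColumn(
        Source,
        "job_schedule_type_first",
        each Text.BeforeDelimiter([job_schedule_type], " "),
        type text
    ),
    // Entries with three values leave a trailing comma ("Full-time,"),
    // so replace commas with nothing in the new column only
    NoCommas = Table.ReplaceValue(
        FirstValue,
        ",",
        "",
        Replacer.ReplaceText,
        {"job_schedule_type_first"}
    )
in
    NoCommas
```

Splitting on the first space and then stripping commas mirrors the two applied steps in the lesson (the inserted column and the replace values step), so each one stays visible and editable on its own.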

530:23

let's look at some other quick examples

530:24

for column from examples for this we're

530:27

going to be using the job posted date

530:29

for this using column from example I'm

530:31

going to select from selection now with

530:34

some of these things whenever I type in

530:36

this box I want to get let's say the

530:38

year in this case if I were to type in

530:40

four one it would pop up that hey with

530:44

all these different options we can do

530:45

and so this provides a lot of different

530:46

options as far as okay I do know if I

530:49

wanted to do the month I could do that

530:51

and pressing enter it's going to copy it

530:53

all the way down that's not what I

530:54

wanted this case though I'm going to

530:55

double click it again go

530:57

2023 and scrolling down and looking

531:00

through this this option here of year

531:03

from job posted date so we're going to

531:04

go with that then press enter and

531:07

looking at the transform we can see what

531:09

is the m language code that it used for

531:11

this it used the Date.Year function

531:14

putting in job posted date this is what

531:16

we want we'll click okay you know how I am

531:19

with naming so we're not going to keep

531:20

this named year so I'm going to modify

531:22

this M language to be job posted

531:25

year with that renamed let's actually

531:27

move over to our other example

531:29

extracting out the hour for this we're

531:31

going to be using that job posted

531:32

datetime column column from example from

531:35

selection in this case I want the hour

531:37

out of it so I'm just going to put

531:38

something like nine and we can see that

531:42

we also have this here for hours from

531:43

job posted datetime I want that one

531:46

press enter again inspecting the M

531:48

language formula it's extracting the

531:50

hour out of this one I'm good with it

531:52

I'm also seeing the other values are

531:53

updating correctly I'll click okay and

531:56

we have our new column called hour which

531:58

you know me we're going to fix this and

532:00

update hour to job posted hour press

532:02

enter all right now we got it so you're

532:05

probably like look I already know how to

532:07

go to something like the transform tab and

532:08

already extract out that information

532:10

using these functions that we used

532:12

before well that was mainly as a primer

532:14

for this next example we're going to be

532:16

doing and that's that with this job

532:18

title column there's some job titles in

532:20

here that have a lot of sort of

532:22

frivolous information that we don't need

532:24

like in this case supervisor information

532:26

technology specialist and then

532:27

parentheses it has associate director I

532:29

don't need anything in parentheses

532:31

similarly for this for the senior data

532:33

engineer I don't need this remote in

532:34

here so let's select this job title go

532:37

into add column column from example and

532:41

from selection for this first one with

532:43

the associate director I'm going to

532:44

select it so it appears below and then

532:46

just highlight what I want press ctrl

532:48

C and then paste it in here then

532:51

scrolling through here to do a

532:53

cursory check so I'm seeing that senior

532:55

data engineer remote is in here I could

532:56

select it and copy this down here

532:58

another option is I just go in here

533:00

double click it since it's now

533:02

populating and delete out that remote

533:06

press enter and it looks like it's doing

533:09

this it's getting the text before

533:11

delimiter of job title specifically before

533:13

the parenthesis and looks like in this

533:16

case University grad data scientist PhD

533:18

only now hiring it removed all that okay

533:21

so this is now doing what we want click

533:23

okay and I don't want to call

533:24

this column text before delimiter I want to

533:26

call this job title clean pressing enter
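
The three columns built from examples in this part of the lesson (year, hour, and the cleaned job title) can be sketched in M as below. The column names job_title, job_posted_date, and job_posted_datetime, the step names, and the sample rows are assumptions for illustration. The exact delimiter the editor picked is also an assumption: a space followed by an opening parenthesis is used here so no trailing space is left behind.

```powerquery
let
    // Hypothetical sample rows standing in for the real job postings table
    Source = #table(
        type table [
            job_title = text,
            job_posted_date = date,
            job_posted_datetime = datetime
        ],
        {
            {"Senior Data Engineer (Remote)", #date(2023, 4, 1), #datetime(2023, 4, 1, 9, 15, 0)},
            {"Data Analyst", #date(2023, 6, 12), #datetime(2023, 6, 12, 14, 5, 0)}
        }
    ),
    // Year pulled from the date column
    AddedYear = Table.AddColumn(
        Source, "job_posted_year", each Date.Year([job_posted_date]), Int64.Type
    ),
    // Hour pulled from the datetime column
    AddedHour = Table.AddColumn(
        AddedYear, "job_posted_hour", each Time.Hour([job_posted_datetime]), Int64.Type
    ),
    // Everything before " (" so the parenthetical note is dropped
    AddedTitle = Table.AddColumn(
        AddedHour, "job_title_clean", each Text.BeforeDelimiter([job_title], " ("), type text
    )
in
    AddedTitle
```

Naming the new column directly in the second argument of Table.AddColumn is the same trick used in the lesson to avoid a separate renamed columns step.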

533:31

all right so last thing I want to now

533:33

clean up these columns and you know how

533:35

I get I want the year and hour to be next

533:37

to the date time the job title clean to be

533:39

next to the job title I could drag and

533:41

drop these I'm going to show you

533:42

something else this reordered column

533:45

step we're going to be modifying the M

533:47

language for this and I don't want

533:50

reordered columns to appear more than

533:51

once so I'm going to take it once again

533:53

and drag it to the very end now what I

533:56

can do is take and modify this m

533:59

language that we have in here now if we

534:02

actually inspect this reordered columns

534:04

it may do this or may not in my case it

534:06

didn't add anything after job skills it

534:09

basically let any new columns just fall

534:11

towards the end so this job skills all

534:14

these other columns after it aren't

534:15

included which not a big deal so what I

534:18

want to do is I want to move this year

534:20

and hour to near job posted date and job

534:23

posted month so I'll enter inside of

534:25

here put in job posted year and also job

534:29

posted hour make sure we're putting

534:31

commas after both of those then I'm

534:33

going to run this to make sure there's

534:34

no issues with it and it looks like it

534:36

moved it over inspecting next to job

534:39

posted date we have our month and also

534:40

year and hour all right the last one is

534:43

this job title clean and I want this to

534:45

be right after job title so I'll go

534:48

ahead and put that in right here making

534:50

sure to put a comma after that and then

534:53

from there pressing this check mark up

534:55

here to move it inspecting over we have

534:58

job title clean right next to

535:02

it our next to look at is custom column

535:06

we'll go ahead and actually just select

535:08

this and whenever we pull this up this

535:11

tells us this allows us to add a column

535:13

that's computed from the other columns

535:15

provides a box to basically put in the

535:17

new column name but right here this is

535:19

where we put in the custom column

535:22

formula or the M language to maybe clean

535:25

it up now let's start with something

535:27

simple let's say I just wanted to repeat

535:29

the job ID column I would come over here

535:31

select job ID click insert it's going to

535:34

put it in notice that the variable

535:37

itself is inside of brackets and I'm

535:39

going to rename this job ID repeat down

535:43

at the bottom it's telling me that no

535:44

syntax errors have been detected I'll

535:47

click okay and then I get this new step

535:49

for added custom and we can see hey it's

535:52

job ID repeat scrolling over yep it

535:55

repeated it if I want to go back in to

535:57

edit it I'll press that settings icon

536:00

and it's going to pull this back up so

536:02

let's do something a little bit more

536:03

complex now and it's going to involve the

536:06

salary year average column and that

536:08

salary hour adjusted column go ahead and

536:11

cancel out of this what I want is to

536:13

create a new column that if there's a

536:15

salary year average value it will

536:17

basically be in that new column and then

536:19

if there's a salary hour adjusted value

536:22

it will be in that column instead

536:24

just as for warning anytime salary year

536:27

average is null there's always a value

536:29

for salary hour adjusted and vice versa

536:32

so like I said we're not going to

536:33

become coding experts with this so I

536:36

recommend making use of chatbots like

536:38

ChatGPT Gemini or whatnot lots of free

536:40

options available out there anyway we

536:42

have this prompt of generate a power

536:44

query formula for a custom column I'm

536:46

building make the column salary year

536:48

average if it's not blank otherwise it

536:50

is salary hour adjusted now it's giving

536:54

me the entire M language right this is

536:56

what we'd be providing to the advanced editor

536:58

providing that previous step name what

537:00

column we're using everything like that

537:02

I care about really this formula right

537:06

here specifically everything after the

537:08

each I'm going to copy this from if all

537:11

the way to the end that's the actual

537:13

code right here going back to the custom

537:16

column I'm going to delete that job ID

537:18

out of there I want to make sure that

537:20

there's an equal sign still there and

537:22

I'm going to paste this in
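
For reference, a custom column formula of this kind might look like the sketch below. The column names salary_year_avg and salary_hour_adjusted, the step names, and the sample rows are assumptions for illustration. In the custom column dialog only the part after each is typed in the formula box, following the equals sign.

```powerquery
let
    // Hypothetical sample rows: exactly one of the two salary columns is filled per row
    Source = #table(
        type table [
            job_id = Int64.Type,
            salary_year_avg = nullable number,
            salary_hour_adjusted = nullable number
        ],
        {
            {1, 140000, null},
            {2, null, 82000}
        }
    ),
    // Use the yearly average when it exists, otherwise fall back to the adjusted hourly value
    AddedCustom = Table.AddColumn(
        Source,
        "salary_year_combined",
        each if [salary_year_avg] <> null
            then [salary_year_avg]
            else [salary_hour_adjusted]
    )
in
    AddedCustom
```

Column references sit inside square brackets, and the keywords if, then, and else are lowercase in M.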

537:24

and you can see from this this is just

537:26

basically an if formula it's doing if

537:28

salary year average is not equal to null

537:32

then salary year average else perform

537:35

salary hour adjusted down at the bottom

537:37

we can see that no syntax errors have

537:39

been detected so I'm going to go ahead

537:41

and click okay so bam we now have this I

537:44

did leave that job ID repeat name here so

537:47

we're going to actually change that to

537:48

rename that value to salary year

537:50

combined and then clicking the check

537:53

mark in order to rerun that formula

537:55

to update the column and you know I like

537:57

to have my steps in order so I'm going to

537:58

grab reordered column and I'm going to

538:00

drag it to the very end and for this one

538:02

I'm just going to drag it over to salary

538:04

hour adjusted right after it to salary

538:06

year combined so now scrolling down just

538:08

to double check it it looks like we got

538:10

140,000 here 140,000 82,000 82,000 there

538:14

so the formula filled out

538:19

correctly so let's get into our final

538:21

task so we've been working this data

538:23

jobs clean data set we made this salary

538:26

year combined which is pretty useful

538:28

actually what happens now if we want it

538:30

in something like data jobs merged what

538:33

do we need to do to actually add it into

538:36

here because we have everything we need

538:38

for it specifically we have that salary

538:40

year average and we have the salary hour

538:42

adjusted columns well we could recreate

538:45

it in here going through all those steps

538:47

creating that if statement or we could

538:48

just copy it out of the advanced editor

538:50

and bring it in here so I'm going to go

538:52

back to data jobs cleaned and then under

538:54

home Advanced editor I'm going to go and

538:57

find the step that's in here

538:59

specifically it was this of added column

539:02

and I'm going to copy it because I can

539:04

see that hey it has the salary year

539:05

combined in it I'm going to copy it all

539:06

the way to the end and I'm going to

539:08

copy it by pressing ctrl C okay go and

539:11

close out of this one and then bring

539:12

over to data jobs merged go into the

539:15

advanced editor and I want to insert it

539:17

in right at the end so I'm going to go

539:20

to at the end of this block of this let

539:22

block going to press enter and then from

539:25

there press ctrl + V to paste it in

539:28

now I'm already getting an error message

539:30

and it's saying hey token comma

539:34

basically expected and it's not getting

539:36

it if I scroll over I can see these

539:38

squiggly lines right here basically

539:39

there's not if we can see there's commas

539:42

after every one of these variable

539:43

definitions so I need to come up here

539:45

put a comma in there next is this a

539:48

comma cannot precede an in so if we

539:50

scroll over we can see this is red

539:53

highlighted probably wrong not to have a

539:54

comma here so we'll get rid of it now

539:56

we're not done it's going to say there's

539:58

no syntax errors but we didn't complete

540:01

this remember you have to have the name

540:04

of the it's got to reference the

540:05

previous name here in it so if I tried

540:08

to even though it says no syntax errors

540:10

if I try to click done and go to load it

540:13

I'm basically getting an error I can see

540:15

this by this basically error bar at the top

540:18

of each one of these columns also

540:20

there's only one applied step and it's

540:22

calling it data job

540:24

merged of the actual title itself but we

540:28

need to fix this query and actually get

540:30

it back to where it had multiple

540:31

different applied steps so I'm going to

540:33

go back to the advanced editor we're

540:34

going to show what we did wrong here and

540:36

that has to deal with remember we had

540:38

before where we had something like

540:40

remove columns you reference the

540:42

previous step in it so in this case

540:44

remove columns right there well rename

540:47

columns is the last one we had I'm going

540:50

to go ahead and copy this by control

540:51

C-ing it but yet we have in here inserted

540:54

text before delimiter one which is not

540:57

correct so I'm going to select all of

540:59

that and replace it by pressing ctrl + V

541:01

so we have the rename columns now one

541:04

other thing we have to do this last

541:06

statement or the in portion needs to

541:09

be referencing that last variable of

541:11

added custom so I'm going to go ahead

541:13

and copy this ctrl C and then pasting

541:17

it in control V click done and now

541:20

scrolling all the way over we can see

541:23

that we have that salary year combined

541:25

column that we created in the last query

541:27

it's at the end we do need to move it

541:29

over but it's in there nonetheless so it

541:31

helps with understanding these queries
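
The three fixes made in the advanced editor can be sketched with a small self-contained query. The step names follow the ones mentioned in the lesson, while the sample table, the original column name salary_avg, and the other column names are assumptions for illustration.

```powerquery
let
    // Hypothetical sample data standing in for the merged query's source
    Source = #table(
        type table [salary_avg = nullable number, salary_hour_adjusted = nullable number],
        {
            {140000, null},
            {null, 82000}
        }
    ),
    // Fix 1: every step except the last one ends with a comma
    #"Renamed Columns" = Table.RenameColumns(Source, {{"salary_avg", "salary_year_avg"}}),
    // Fix 2: the pasted step must take the previous step of THIS query as its
    // first argument, not the step name it referenced in the query it was copied from.
    // It is now the last step, so there is no comma before "in".
    #"Added Custom" = Table.AddColumn(
        #"Renamed Columns",
        "salary_year_combined",
        each if [salary_year_avg] <> null then [salary_year_avg] else [salary_hour_adjusted]
    )
in
    // Fix 3: "in" returns the last step, so it has to name the newly pasted one
    #"Added Custom"
```

A step name that contains spaces is written as #"..." in M. The editor only checks syntax, which is why a wrong step reference can still report no syntax errors and then fail when the query loads.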

541:34

now one quick thing before we go we've

541:35

gone through basically every single

541:37

thing in this chapter on power query up

541:40

to this point with the exception of this

541:43

invoke custom functions this basically

541:46

invokes a custom function defined in the

541:48

file for each row of this table this is

541:50

more advanced and Beyond the scope of

541:52

this course we're not going to be

541:53

covering it but is available for you to

541:55

dive into say you're doing a lot of

541:57

different Imports and you need to

541:59

automate the Imports that you do this

542:02

would be a path you would go but for

542:04

beginners like us I'm going to say stay

542:06

away from it for the time being so this

542:08

now wraps up on the M language and that

542:12

was really a crash course in

542:15

understanding how to use it by no means

542:16

do you need to be a professional or be an

542:19

expert coder in coding the M language

542:21

if you got lost at any point along the way

542:22

nothing to feel ashamed about this is a

542:24

pretty complex topic if you would

542:26

like to learn more I do recommend this

542:29

book which is M Is for (Data) Monkey it's

542:31

a good little read talking about not

542:33

only Power query but also how to

542:35

manipulate the M language I'll include a

542:37

link in the description below anyway

542:39

power query in my opinion is one of the

542:41

most important features the most

542:43

powerful tools within Excel and also

542:46

Power BI and so it's worth your time

542:49

investing and learning it and so this

542:51

all culminates and we're now finalized

542:52

covering power query in this chapter in

542:55

the next chapter we're going to be jumping

542:57

into Power pivot and that's going to

542:59

be jumping into actually data modeling but

543:01

before that for those that purchase C

543:03

practice problems you have some practice

543:04

problems to go through and get more

543:06

familiar with that M language before

543:07

proceeding forward all right with that

543:09

see you in the next

543:13

one welcome to this chapter on power

543:16

pivot and this chapter consists of four

543:19

different lessons where we're going to

543:21

go an intro into Power pivot and over

543:23

the window that it actually

543:24

provides then from there looking into

543:27

DAX or Data Analysis Expressions which

543:31

is a Formula language very similar to

543:33

excel formulas but before we actually

543:35

jump into this lesson and going over

543:37

what we're going for it we're going to

543:39

focus on what exactly is power

543:45

pivot so here I am in Excel and this is

543:48

meant for me to just go through and

543:50

quickly explain what is the power

543:54

of power pivot I know that pun is

543:56

getting sort of old by now but it really

543:58

is powerful if you're curious of looking

544:00

at it it's in the workbook of power

544:02

pivot intro part one part two is what

544:04

we're going to be using for the actual

544:05

lesson so in power query in the last

544:07

chapter we ended up cleaning up our data

544:09

set to have these two main tables versus

544:11

data job salary which has the complete

544:13

data set on all the data science job

544:15

postings and then data job skills which

544:17

is unique to the skills for a job we

544:20

also created a data jobs merge table but

544:23

that table is actually going to be well

544:25

it's pretty much Obsolete and power

544:27

pivot is going to help replace that and

544:29

for good reason so what exactly is power

544:32

pivot well it's an add-in we're going to

544:33

get to adding it in and it has a few

544:36

different features that you can do

544:37

within it such as accessing the data

544:39

model adding measures kpis and whatnot

544:41

this lesson is going to be going over

544:43

this tab as a quick refresher power

544:46

pivot is going to be available in

544:48

basically any version of Excel for

544:50

Windows past

544:52

2010 but it's completely not available

544:55

in either the Mac version or the

544:57

Microsoft online version so you won't be

544:59

able to do this chapter if you have

545:01

those versions or the final project

545:03

anyway the core portion of power pivot

545:06

is actually managing a data model and

545:09

what's a data model well a data model

545:12

defines how data is basically structured

545:15

stored and also related in this case we

545:20

have the data jobs salary table right

545:22

here and we have the data jobs skill

545:25

table what we can do with power pivot

545:27

besides modeling these tables and

545:29

showing how they're structured is the

545:31

more important thing of creating a

545:32

relationship in this case I created a

545:34

relationship between the job ID of data

545:37

job salary and that of data job skills

545:40

and because I created this relationship

545:42

I can look at things like the job title

545:44

short column see how many jobs it

545:46

has with it but also I can query across

545:50

a table over to the job skills and see

545:52

how many skills it has with it in fact

545:54

let's actually do that real quick here I

545:56

have my data model itself I have my two

545:58

tables which are shown anyway I can look

546:01

at things like what are the count of the

546:03

different job titles themselves I'm

546:05

going to do that on job ID and like

546:07

we've done plenty of times before here's

546:09

the job count with a little clean up of

546:10

the actual text here but now with power

546:13

pivot I can actually reach across to

546:15

that other table of data job skills and

546:17

drag the job skills into here and this

546:20

is telling us obviously the count of the

546:22

skills based on the job title pretty

546:26

cool that we can reach across the tables

546:27

and do this now the other cool thing

546:29

that power pivot unlocks is Dax or data

546:33

Analysis Expressions recall previously

546:35

that we were using the average of the

546:38

salaries and like we learned way back

546:40

earlier in this Excel course we prefer

546:42

actually a median salary but

546:44

unfortunately looking at the value field

546:46

settings window here there is no option

546:50

to actually pick median from this and

546:53

that's where DAX comes to the

546:54

rescue with this I can go to something

546:56

like the power pivot Tab and now create

546:59

a measure which is where you actually

547:01

insert in your Dax and I can create a

547:04

new one called median salary and we're

547:06

going to be using this Dax formula in

547:08

this case I'm going to use the median

547:09

formula very similar to the Excel

547:11

formula and I can do it on the entire

547:14

salary year average column here I'm

547:16

going to format it real quick and then

547:18

press enter anyway bam now we have

547:21

because of the power of Dax we have the

547:23

ability to get the median salary and

547:27

those Dax things can do some pretty

547:29

complicated calculations so in the case

547:31

of here we have this job count and count

547:33

of skills and we want to see what were

547:35

the skills per job specifically in this

547:38

case what is something like C2 / B2 and

547:42

then dragging all the way down and

547:43

filling it for all these this provides a

547:46

much better analysis of what's going on

547:49

with these values of counts and skills

547:51

here when we get this proportionality we

547:53

can create this with measures as shown

547:55

in this final pivot table that we're

547:56

going to be creating coming up in the

547:58

third lesson of this chapter so in

548:01

summary power pivot provides us the

548:03

opportunity to now model our data which

548:06

allows us to one create relationships

548:09

and two unlocks these

548:12

measures that we can create using

548:17

Dax all right so let's get into this

548:20

lesson what we're going to be focused on

548:21

for well first thing is we're going to

548:22

enable the power pivot plugin and

548:24

then from there actually getting in to

548:27

data modeling or modeling our data that

548:30

we imported through Power query after we

548:33

have everything set up with our data

548:34

model we're going to then move into

548:36

performing our first analysis analyzing

548:39

based on a job title how many different

548:42

skills they have associated with it like

548:45

I said we'll eventually get to that

548:46

skills per job in an upcoming lesson so

548:49

for this you can continue to work in

548:51

that workbook that we were working with

548:54

in the last chapter on power query

548:55

we're going to continue work on that

548:57

because we want to use those queries

548:58

that we built if you got lost along the

549:00

way and just want to start back up we're

549:02

going to be starting from that M

549:04

language workbook back in the power

549:05

query chapter as a reminder these

549:08

lessons or workbooks are what are the

549:11

completed workbooks at the end of the

549:13

lesson specifically for this lesson part

549:16

one was just that intro part two is what

549:18

will be done at the end of this lesson

549:24

anyway here I am in the M language

549:25

workbook we need to get into enabling

549:27

power pivot right now you probably don't

549:29

see Power pivot up at the top of the

549:30

tabs so I'm going to go into file and

549:33

then go down to options from here I'm

549:35

going to select add-ins like we did

549:37

before and instead of excel addins we're

549:40

actually going to be using those COM

549:41

add-ins I'm going to click go and they have

549:44

three different ones available data

549:45

streamer power map and power pivot we

549:47

want Power pivot I'm going to go ahead and click

549:50

okay now power pivot should appear up at

549:53

the top all the way on the right hand

549:55

side and should look something like this

549:58

quick little overview of this tab manage

550:00

here pops up the power pivot window

550:04

which we're going to be doing a deep

550:05

dive on this in the next lesson we're

550:08

going to use it a little bit in this

550:09

lesson but anyway that's one way you can

550:10

actually access it you can also go to

550:13

the data Tab and then here under data

550:16

tools you should see it also and you'll

550:19

be able to manage your data model and

550:22

once again it will pop up the window

550:23

additionally on this tab you have the

550:25

ability to create measures and kpis

550:28

which we're going to be diving deep into in

550:30

the third and fourth lesson if you have

550:32

a table within your worksheets you can

550:34

add it to your data model you can also go

550:36

about detecting relationships although I

550:38

don't find that this feature works that

550:40

well and then finally they have settings

550:42

and settings I don't really touch that

550:44

much nor does it have much control

550:48

here so let's actually get into

550:51

onboarding some data into our data model

550:53

we're going to do a simple example first

550:55

here I created a new sheet made three

550:56

columns of ID name salary and then

550:59

different values associated with it one

551:01

way I can add to the data model is if I

551:03

have data in a table is to do this

551:05

feature of add to data model in this my

551:07

table has headers I'll go ahead and

551:09

continue and then it will pop open power

551:11

pivot a similar like environment will

551:14

exist with Excel I can't actually edit

551:17

any numbers in here this is just how

551:19

you're modeling your data if you needed

551:21

to actually edit it I have to go back to

551:22

the sheets and like I said this isn't a

551:24

method I typically use typically have

551:26

bigger data sets not located in tables

551:28

so I'm going to go ahead and rightclick

551:29

this down at the bottom this table name

551:31

of table two click delete it's going to

551:33

say hey are you sure you want to delete

551:35

this table and Bam it's gone all right

551:38

so now there's nothing in our data model

551:40

right now here we are still inside the

551:42

power pivot window and if you've noticed

551:44

from this in the Home tab right here it

551:47

has the option to get external data they

551:50

have options for you to actually connect

551:52

directly with power pivot to things

551:55

like a SQL Server Microsoft Access you

551:58

could also get it from some sort of data

552:00

feed and then this option would be more

552:02

probably useful in that it has a lot of

552:04

different sources you could use such as

552:06

other Excel files text files such as

552:08

csvs and whatnot now you may be asking

552:11

yourself I'm going to close out of this

552:12

power pivot why would I import of that

552:15

whenever we just went through with power

552:18

query to get data via this when which

552:22

time should I use which well it's very

552:25

important to remember the purpose of the

552:28

tool that you're using power query is an

552:31

ETL tool extract transform and load we

552:35

did a lot of Transformations with our

552:38

data set and so that's really the power

552:41

of power query and then it loads it in

552:45

power pivot's strength is not in ETL or

552:47

data cleaning instead it's in data

552:50

modeling creating these relationships

552:51

and DAX now you may be tempted to

552:54

come inside of existing connections and

552:56

try to connect to specifically that

552:59

salary and skills and if we went through

553:02

like in the salary case and try to click

553:05

open we're going to get an error message

553:07

and I'll be honest this is really

553:09

confusing because we have this workbook

553:11

connections why isn't this working well

553:13

it really just comes down to naming

553:15

conventions and the fact that power

553:17

query connections are not the same as

553:19

power pivot connections but we have a

553:20

fix for this we just need to exit out of

553:22

the power pivot window here inside of

553:25

queries and connections remember you can

553:27

get to that by going to the data Tab and

553:29

going to queries and connections we can

553:30

go to something like data job salary

553:32

which right now is a connection only

553:34

right-click it and go to load to right

553:37

now it's only under only create

553:40

connection but we need to check this

553:42

check mark of add this data to the data

553:46

model I'm going to click okay it's going

553:48

to go through this process of loading

553:50

the data and now it talks about the rows

553:53

are loaded but mainly if I go to the

553:56

connection it has this new connection

553:59

now of this workbook data model which if

554:02

I go to and actually open up or manage

554:05

our data model we can see that it's

554:07

inside of here we have this basically

554:09

sheet for the table itself of data job

554:12

salary inside power pivot inside the

554:15

data model now we do need to get that

554:17

other pivot table or other table into

554:20

there as well so I'm going to go to queries

554:22

data job skills right click this load

554:24

to and also add this to the data model

554:27

okay it talks about 167,000 rows are

554:29

loaded and another connections still

554:32

it's only going to be one connection

554:34

because we only have one data model in

554:35

this case and now when I go to manage

554:38

the data model I have two basically

554:41

sheets down here but two tables and now

554:43

we have the data job skills in

554:48

here anyway I want to do some cleanup

554:50

real quick I'm going to clean up power

554:51

pivot but this data jobs merged and this

554:54

data jobs cleaned it's going to be very

554:56

confusing like I said we're not using

554:58

this mainly for the fact that we have

555:01

duplicate values in here for senior data

555:04

scientists in this case and then for the

555:06

salaries and so if we don't manipulate

555:08

this in a correct manner we're going to

555:11

get the wrong results so we're just

555:13

going to get rid of these so for data

555:15

jobs merge I'm going to right click and

555:17

select delete and it's going to say hey

555:20

are you sure you want to delete data jobs

555:22

merge yes I do and then I'm going to do

555:24

the same thing with data jobs clean

555:26

right click it and select delete also if

555:29

you have these tabs down here for data

555:30

jobs clean or merge you can go ahead and

555:32

delete those as well with our models now

555:35

cleaned up let's actually get into going

555:37

over really briefly this power pivot

555:40

window with this we have three main tabs

555:42

of Home Design and advanced advanced

555:45

we're not going to go into a lot of

555:46

things inside of this if any at all it's

555:49

beyond the scope of the course we're

555:51

going to be focusing mostly on the home

555:52

and the design tab so with this tab

555:54

we've already gone over get external

555:56

data but we can do things like refresh

555:57

our data if we know that it's updated in

555:59

power query generate pivot tables and

556:01

pivot charts based on our data model

556:04

itself change the formatting of a

556:07

particular column in this case it's

556:09

noticing as text if we go to the data

556:11

jobs salary data we can actually scroll

556:14

over and see that for the salary year

556:15

average column it knows that it's a

556:17

currency we did a lot of this cleanup

556:19

right in power query and setting these

556:21

different data types so this saves a lot

556:23

of steps here in power pivot if it

556:25

wasn't done now we have options

556:27

displaying the table below that we can

556:28

actually sort it we can filter it or

556:30

sort by a certain column they also

556:33

provide options to find a specific value

556:36

within here and then these features for

556:38

calculations I don't find myself using

556:40

that much as far as the auto sum so anyway

556:42

over on the right the most important

556:44

thing I find is it allows you to toggle on

556:47

the different views of your data set so

556:49

right now this is the data View and if I

556:51

scroll over here this is the diagram

556:54

View and this is going to show our two

556:57

different tables side by side I'm going

556:58

to move them over and actually expand

557:01

this one out to show all the different

557:02

columns and then the data job skills now

557:05

back on that data view clicking that we

557:08

have data view but also below this we

557:10

have this calculation area which I can

557:12

toggle on and off calculation areas are

557:16

where we're going to be storing our

557:18

different measures that we build with

557:20

Dax and so they'll be appearing

557:22

underneath here here if we have any

557:23

hidden columns we'll be able to toggle

557:25

them on and off right now I don't have

557:27

any hidden columns now one thing to note

557:29

with this data cleanup some of that we

557:30

did before with formatting stuff some of

557:32

it's going to be quite limiting you may

557:34

not be able to do like in the case of

557:36

this so data job skills has this job

557:37

title short column and actually if we

557:40

look at the data jobs salary data set we

557:43

have the same repeated column in it so

557:45

data job skills this job title short

557:47

right here is unnecessary now I could

557:49

rightclick it and try to delete the

557:52

column

557:53

and it asks me if I want to delete it it's

557:54

going to tell me it's not going to be

557:56

able to do it because it was created by

557:57

a query i.e. through Power query and

558:00

instead I should actually update it

558:01

through Power query which I would

558:03

actually argue as best practice anyway

558:05

so I could exit out of power pivot launch

558:07

power query by pressing alt F12 then go

558:11

into the data jobs skills query and if I

558:14

want I can just select this column and

558:16

select remove columns but you know how I

558:19

am I like to actually clean up the

558:21

applied steps because it could depending

558:23

on how large your power query query is

558:26

it could take a long time to load it and

558:28

unload it unnecessarily so if I go to this

558:29

remove other columns that's the first

558:33

time that it appears in it I can remove

558:35

this by deleting it out of there then

558:37

pressing enter we may get an error

558:39

message we may not I'm not sure going to

558:42

the last step in here I notice there's one

558:45

thing of the table wasn't found

558:47

specifically here it's appearing job

558:49

title short in here so I can go ahead

558:51

and delete job title short along with

558:52

that comma and Bam we now have this

558:55

Final Table just to lean for those two

558:57

steps I'm going to go ahead and close

558:58

and load this and now going back in to

559:02

look at our data model and power pivot I

559:04

can see that it updated for data job

559:09

skills all right moving into this design

559:12

tab within power pivot this has a few

559:15

different options within it for adding

559:17

columns freezing columns just messing

559:18

with the columns they also have

559:20

different options for creating

559:21

calculations concerning columns we'll be

559:24

getting into calculating columns more in

559:26

the next lesson so stay tuned for that

559:29

right the main thing that we're actually

559:30

going to be doing in this portion of the

559:31

video is actually setting up

559:33

relationships and that is we could go

559:36

about creating a relationship here and

559:39

right now I have data job skills and I

559:42

could relate it with the job ID by

559:45

pulling the drop down to the data jobs

559:47

salary table on that job ID now that's a

559:50

way I can do it I'm actually not going

559:51

to do it this way I actually prefer

559:53

going to the diagram View and then from

559:57

there just dragging and dropping the job

560:00

IDs across each other and then it

560:02

establishes this connection which we can

560:04

see through this line through here now

560:07

there's a few different things that we

560:08

need to notice from this line here one

560:12

this Arrow it's going to come to bite Us

560:14

in the butt later and that's that that

560:16

Arrow only allows data flow in One

560:20

Direction and by data flow I mean

560:22

filtering if I try to filter something

560:24

in the data job skills table this arrow

560:27

is only pointing in One Direction I

560:29

won't be able to filter it back we'll

560:31

encounter those problems in a little bit

560:32

and we'll talk about strategies how to

560:34

actually offset it the other thing to

560:36

note with this relationship here is you

560:38

notice right here it says one and over

560:40

here it says star in this case this is a

560:44

one to many relationship and what does

560:48

this mean well going to our data view

560:50

for data job salary we only have one

560:53

unique ID for each job whereas in the

560:57

data jobs skills we have multiple

561:01

different job IDs or many job IDs now if

561:05

we only had one job ID in there and we

561:08

actually looked at that diagram view for

561:10

this relationship we'd have a one to one

561:12

relationship but we have multiple skills

561:13

in there so that's not possible now it's

561:15

also possible to have a basically asterisk to

561:15

asterisk or many to many relationship but

561:21

that causes a mess slows down your data

561:24

model and I don't recommend it so you

561:25

should typically see either a one to one

561:28

or a one to many last little wrap up

561:30

before we actually analyze and use this

561:32

relationship we have the options for

561:34

table properties which we're not going

561:35

to be able to look at because this was

561:37

created via power query for this

561:39

connection and then we have options to

561:41

create date tables underneath calendars

561:44

which we're going to be exploring in an

561:45

upcoming lesson and like always you have

561:48

an undo and redo anyway let's actually

561:50

get into analyzing and putting this

561:53

actual relationship to the test so what

561:56

we're going to do is inside the Home tab

561:58

go to pivot table because we want to

561:59

create a pivot table with this we're

562:01

going to insert a pivot table and we'll

562:03

have it insert into a new worksheet

562:06

selecting inside the pivot table it's

562:07

not having the field list come up so

562:09

I'll select it under pivot table analyze

562:11

anyway we want to query across this

562:13

table to show the power of the

562:15

relationships so what I'm going to do is

562:17

from the data jobs salary table I'm

562:19

going to take that job title short throw

562:21

it into the rows and then from there

562:24

going to come down to the data jobs

562:27

skills table and I'm going to throw the

562:29

job skills into the values it should be

562:31

performing a count and then I'm going to

562:35

organize this real quick from largest to

562:37

smallest and it looks like data

562:39

Engineers have the most so this is

562:41

pretty neat we're able now to query

562:43

across tables going back into that power

562:46

pivot window this connection allows us

562:48

to do that I'm going to just show you

562:49

something real quick by clicking this

562:52

relationship right clicking it and deleting

562:54

it want to delete from model and I want

562:56

to show you how these values are

562:57

basically going to change inside our

562:59

pivot table basically to the fact that

563:02

they're going to have it to where

563:04

they're all the same value and that's

563:06

how you know that your relationship is

563:08

not set up correctly whenever you have

563:10

multiple repeating values and you expect

563:12

them not to be anyway sometimes you'll

563:14

see this popup come up of relationships

563:16

between tables may be needed

563:18

autodetect and sometimes it works

563:21

sometimes it doesn't um in this case it

563:23

looked like it worked so we're going to

563:25

go with it and just double-checking it

563:27

in power pivot it is set up

563:31

correctly so for this final analysis

563:34

we're going to be looking at building

563:36

this visualization right here analyzing

563:38

what are the top skills of data nerds

563:41

we're basically remaking what we did in

563:43

the power query chapter now that we have

563:45

that updated data model anyway we're

563:47

going to build this out to see where the

563:48

skills counts for each of these and also

563:51

provide filters for job country so back

563:54

inside the workbook that we were

563:55

previously working with if I would

563:57

actually remember we did make that sort

563:59

of similar visualization that I talked

564:00

about but however if I go to data and

564:03

actually refresh the data it's going to

564:06

give me this error message because once

564:07

again we deleted data jobs merged anyway

564:10

I thought this was actually going to go

564:12

away it didn't it is not what we want

564:14

we're going to delete this one and then

564:16

we're going to do a little bit of

564:17

cleanup so that one that we created the

564:19

job analysis on I'm going to actually

564:21

just rename that quick to job analysis

564:25

and then now in this new sheet we're

564:27

going to do we're going to name this one

564:29

skill job analysis anyway let's insert a

564:32

pivot table in here so we go to insert

564:34

pivot table and now what we have the

564:36

option for is from data model and it's

564:40

asking if I want to put it in the existing

564:41

worksheet yes I do remember we want to

564:44

analyze the skills and specifically how

564:47

many counts they have associated with it

564:49

or how many jobs they have associated

564:50

with it so I'm going to put the skills into

564:52

the rows and then from there I want to

564:54

count how many jobs are associated with

564:56

it so I'm just going to drag that job ID

564:58

into the values right now it's doing a

565:00

sum going to click on it go to Value field

565:02

settings change this to count now you

565:04

may be like Luke could we use the job

565:08

skills count and we can which has the

565:11

same exact values but actually closing

565:13

this out and taking out job skills

565:15

you're probably more interested in why

565:18

can't I use something like the job ID

565:19

from the data job salary table well if

565:22

I drag that over and then I change this

565:25

value field setting to count and

565:28

click okay you notice it says

565:31

32,672 which is coincidentally the same

565:34

number of rows of that data set and this

565:36

gets into the point of filter Direction

565:38

what do I mean by that let's go back to

565:40

the data model itself looking at it in

565:42

diagram view remember the arrow is

565:45

pointed towards the data job skills table

565:48

right now I have job skills in the rows

565:51

and I'm trying to filter for data job

565:54

salary based on the count of the job IDs

565:56

but the arrow doesn't flow in that

565:58

direction we can't do it now in

565:59

something like Power BI you can actually

566:01

right-click this edit the relationship

566:03

and change the direction that's not

566:05

possible within Excel unfortunately
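
To make the filter direction idea concrete, here is a minimal sketch in Python, not anything Excel runs internally. The two tiny tables and the helper names `skill_counts_for_title` and `salary_job_count_per_skill` are made up for illustration. It shows that a filter chosen on the "one" side (salary) narrows the "many" side (skills), while grouping by a skill leaves the salary table unfiltered, so every skill reports the full row count.

```python
from collections import Counter

# Hypothetical miniature versions of the two tables (not the course data).
# "One" side: one row per job_id.
data_jobs_salary = [
    {"job_id": 1, "job_title_short": "Data Analyst"},
    {"job_id": 2, "job_title_short": "Data Engineer"},
    {"job_id": 3, "job_title_short": "Data Analyst"},
]
# "Many" side: several rows may share a job_id.
data_job_skills = [
    {"job_id": 1, "job_skills": "sql"},
    {"job_id": 1, "job_skills": "excel"},
    {"job_id": 2, "job_skills": "sql"},
    {"job_id": 2, "job_skills": "python"},
    {"job_id": 3, "job_skills": "excel"},
]


def skill_counts_for_title(title):
    """Filter flows one -> many: a title picked on the salary table
    narrows the job_ids, which in turn narrows the skills rows."""
    job_ids = {r["job_id"] for r in data_jobs_salary
               if r["job_title_short"] == title}
    return Counter(r["job_skills"] for r in data_job_skills
                   if r["job_id"] in job_ids)


def salary_job_count_per_skill():
    """Filter does NOT flow many -> one in this sketch: grouping by a
    skill leaves the salary table unfiltered, so every skill reports
    the full salary row count."""
    skills = sorted({r["job_skills"] for r in data_job_skills})
    return {skill: len(data_jobs_salary) for skill in skills}


print(skill_counts_for_title("Data Analyst"))
print(salary_job_count_per_skill())
```

The second function mirrors what the pivot table showed: the same total on every row, which is the sign that the filter is not reaching the table being counted.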

566:08

anyway we're going to be using Dax to

566:09

fix this in the future for the time

566:11

being we're just going to go about using

566:14

in this case for this analysis the same

566:16

values in the same table I'm going to

566:18

remove this other job ID from the other

566:21

table anyway we're going to sort these

566:23

values from largest to smallest then

566:25

additionally I only want to show the top

566:28

10 skills so I'll go to Value filters

566:31

and then top 10 dot dot dot top 10

566:34

items by count of job ID is what I want

566:37

and so now we have this so now we have

566:39

the values we want to visualize I'll go

566:41

in and actually insert a pivot chart for

566:44

this I like the bar because it makes it

566:46

easier to read the different skills that

566:48

it has right there and I'm realizing now

566:50

the sort order is actually

566:52

backwards in this I want it from

566:54

smallest to largest I'm also going to

566:56

right click and hide all field buttons

566:58

we're also going to be adding axis

566:59

titles for the primary horizontal and

567:01

then removing that Legend we'll update

567:04

this title to what are the top skills of

567:06

data nerds and then the y- axis is

567:09

self-explanatory but for the x-axis

567:12

we'll label this skill count in job

567:14

postings okay the last thing we need to

567:16

do now is actually add some slicers to

567:19

this so we can actually control it

567:20

better so selecting the table itself

567:23

going to insert slicers I'm going to

567:25

select the job title short and also we

567:28

want job country right here with each

567:31

of these slicers I'm going to rename

567:33

them also this one job title short I'm

567:35

going to rename to job title and then

567:37

job country I'm going to rename to

567:40

Country now when I go through I can

567:43

actually select something like data

567:45

analyst and it will filter down and

567:48

actually see the associated skills I

567:51

could also do something like look

567:52

at those in the United States

567:54

specifically for their counts and we see

567:56

that SQL Excel and Tableau are the three

567:59

top skills now you may be scratching

568:01

your head on like okay I thought we were

568:03

trying earlier to actually aggregate

568:05

something in the pivot table and it

568:06

didn't work well remember this arrow is

568:10

pointing to the filter Direction so in

568:12

our case we have a job title short

568:14

slicer because this arrow's in the

568:17

direction back to the data job skills

568:19

table we can filter in that direction

568:22

but we cannot conversely filter in the

568:24

other direction that's why we can't get

568:25

the counts from these tables little

568:27

confusing I know but I promise you we

568:29

will work it out as we go through this

568:31

entire chapter in power pivot so bam we

568:35

just completed our first analysis for

568:37

our final project we have a few more

568:39

analysis coming up in the next lessons

568:41

you do have some practice problems

568:43

though to go through and get yourself

568:45

more familiar with power pivot and

568:48

understanding what's going on with these

568:49

relationships the one to many and

568:50

whatnot all right with that I'll see you

568:52

in the next one which we're going to do

568:54

a deeper dive on looking into that power

568:57

pivot window with that I'll see you

569:02

there all right let's now dive further

569:05

into Power pivot and we're going to be

569:06

focusing on the power pivot window for

569:08

this we're going to be looking at some

569:10

major aspects of it for this we're going

569:12

to get into using a little bit of Dax to

569:14

create our first measure and with those

569:17

measures we're also going to be

569:19

exploring the difference between

569:20

implicit and explicit measures don't

569:22

worry we'll cover that in a bit from

569:24

there we're going to move into a feature

569:26

that's related to measures called

569:28

calculated columns and it's going to

569:30

allow us to inside of our data model

569:32

create different values such as in this

569:35

case we can actually create a date column

569:38

from our date time value the last thing

569:40

we'll explore are date tables which

569:42

power pivot gives us with a click of a

569:44

button and allows us to connect these

569:46

date tables to our

569:49

original data source and then filter it

569:51

by a lot of different data and so we'll

569:53

wrap this all up with a final analysis

569:55

where we're looking at job postings

569:57

based on a day of week using this date

570:00

table anyway jumping into Excel for this

570:02

we're not going to be using any of the

570:04

work that we've done previously instead

570:06

we're going to open up a completely new

570:08

workbook and be working out of this

570:10

instead and the reason is all the work

570:13

that we're going to be doing within this

570:14

lesson we're not going to be carrying it

570:16

on to our project that we're using

570:18

this is more this lesson is more to get

570:20

us more familiar with the powers of power

570:22

pivot oh gosh this pun's killing me and

570:25

so we'll eventually incorporate some of

570:26

the stuff into our final project but

570:28

like I said we're going to be starting

570:29

with a blank notebook or workbook for

570:31

this as always if you want to see what

570:32

the results are at the end of this

570:35

lesson you can just go to Power pivot

570:37

window and it will have

570:41

it all right so let's actually get some

570:44

data into here to start working with and

570:46

like I said we're not going to use power

570:48

query at all for this we're going to use

570:49

power pivot so I'm going to open up the

570:51

go to manage the power pivot data

570:53

model and we want to get this external

570:55

data specifically we want to get that

570:57

Excel workbook that we've been working

570:59

with of data jobs salary all so

571:02

underneath the Home tab I'm going to go

571:03

to get external data and it's going to

571:05

be from other sources we scroll all down

571:08

we could look at how we can import it

571:09

from different databases or whatnot

571:11

we're going to be doing it from an Excel

571:13

file then from there we're going to

571:15

browse the connections navigating into

571:18

that data set folder I'm going to select

571:19

data jobs salary all it prompts me if I want

571:22

to use the first row as column headers I

571:24

do if I wanted to I could go in and test

571:26

the connection to make sure it's

571:28

going to succeed and it does so we'll go

571:30

from there to next it sees that it has

571:32

one sheet within the workbook that's the

571:34

one that I want I'll click finish next

571:36

it'll go through the import looks like

571:38

it completed it has a success got 32,000

571:40

rows I'll click close now let's go

571:43

through and actually clean this data set

571:46

up using power pivot now I know in the

571:48

last lesson I talked about hey we're

571:51

using power query for ETL and that's

571:53

true but let's say you have a quick data

571:55

set you need to connect to and model

571:57

quickly in that case you would do some

571:59

of the stuff that I'm going to do here

572:01

in order to quickly model it if I wanted

572:03

to rename it I'd come down to this

572:04

basically sheet tab down here it's

572:06

called sheet one after where it's at

572:08

I'll rename it and we'll keep a similar

572:10

naming convention of data jobs salary go

572:14

ahead and click enter so let's say for

572:15

this quick analysis that we're trying to

572:17

do in this lesson I'm trying to analyze

572:19

only the yearly salary data I don't care

572:21

about the hourly salary data and

572:23

I don't even want the data entries in

572:25

here well I can get rid of that salary

572:28

hour average column by just deleting this

572:30

column by right clicking it it's asking

572:32

me if I want to delete it yes and now

572:34

there's still blank values in here right

572:36

so I need to get rid of these salary rate

572:39

values that are equal to hour so I'm

572:41

going to click the filter here unclick

572:43

next to hour and click okay so now we

572:46

have that out the other thing I want to

572:48

do is actually clean up the format of

572:50

the salary and I'm going to change that

572:52

instead to a currency and this talks

572:55

about how the data is going to be

572:56

changed where it's stored yeah I

572:58

don't really care about that no data

573:00

that I care about will be lost it'll all

573:02

be here still and then I'm going to

573:04

reduce the decimal places by two the

573:06

other thing I can do if I wanted to is

573:08

actually sort this based on that job

573:10

posted date could come up here and sort

573:13

from newest to oldest and then it's in

573:15

order sorry actually want it oldest to

573:17

newest got confused on that one so bam

573:20

just did some quick clean up to our data

573:22

set and now we're ready to proceed

573:27

forward so let's actually get into

573:29

building our first measure or measures

573:33

specifically I want to analyze this to

573:35

understand what is the

573:37

amount of jobs in here and then also

573:39

what is the average and then also more

573:41

importantly the median salary well

573:44

there's a few different ways we can do

573:45

this we're going to do this first within

573:47

this power pivot window so in order to

573:50

do this first I want

573:52

to do a count so we're going to just run

573:54

this on this job title short column and

573:57

here underneath on the Home tab under

573:59

calculations we have this Auto sum I

574:01

don't frequently use this I use it every

574:03

now and then but I can run things on

574:05

this like count or distinct count I'm

574:07

going to do count in this case and this

574:09

is going to create our first measure

574:11

down here remember down below this area

574:14

is our calculation area I can toggle it

574:16

on and off by clicking calculation area

574:18

up here anyway I can also make this

574:21

column slightly bigger and what's cool

574:23

about this keep on scrolling over is now

574:27

it tells us the name of this measure

574:29

count of job title short and that there's

574:31

22,000 remember there's normally around

574:34

30,000 but because we've taken out that

574:36

hourly data we're down to 22,000 now I

574:39

can also edit this measure if you notice

574:43

it appears right up here similarly they

574:45

have a formula bar in power pivot and to

574:48

the left hand side it tells you what is

574:51

actually selected job title short column

574:53

and then the actual measure itself in

574:55

here now one quick note there is

574:59

basically a colon and then an equal sign

575:01

that's how we're going to know that

575:03

we're doing measures and we'll get to

575:05

calculated columns in a little bit and it

575:07

will only be the equal sign but this is

575:09

Microsoft's way of signifying that

575:12

we're using a measure so that way you

575:14

don't confuse it with anything else anyway

575:16

I can edit this the actual title in this

575:19

case and I can change this to something

575:21

more descriptive to job count

575:23

pressing enter it now runs it and it's a

575:26

lot shorter additionally if I want to

575:29

actually format it I can have the

575:31

measure selected come up here select

575:33

comma and then it formats it with the

575:35

comma and then I don't want two decimal

575:37

places I'll go ahead and remove it next

575:39

let's get into analyzing that salary

575:42

column with this once again we can click

575:44

this I could use that auto sum and do

575:46

something like average here clicking

575:48

average and below it it generates that

575:51

average

575:52

of salary year average 123,000 and I can

575:54

change it if I want to average salary

575:57

but if I wanted to calculate something

575:59

like the median instead I would have to

576:02

actually manually type out this

576:03

calculation so selecting right below

576:05

average salary and then coming into the

576:07

formula bar I can type in something like

576:09

median salary remember we want to create

576:12

a measure so it's going to be a colon

576:14

and then an equal to and then for this

576:16

we want to use the median function now a

576:19

lot of these functions that are Dax

576:22

functions are very similar to what we

576:24

use in Excel so they have a lot of

576:25

different similarities but with this

576:28

like we talked about before this allows

576:30

us to now put in basically an entire

576:33

column into it and then perform that

576:35

entire aggregation on it in this case I

576:37

want to do it all on salary year average

576:40

making sure I put a close parenthesis to

576:42

close out that function and Bam right

576:45

next to average salary we have this

576:46

median salary now which needs to be

576:48

formatted so I'll format it as English

576:51

United States USD and remove the

576:53

decimal places now what happens if I

576:55

didn't enter that colon equal sign so

576:57

here I am selected below median salary

576:59

we'll go ahead and paste in that formula

577:01

and we'll delete that colon I haven't

577:03

run this yet now I'm going to run I'm

577:04

going to press enter and as you notice

577:06

by this it's not actually calculating a

577:09

value it actually just converts this to

577:12

text so this is not what we want that's

577:14

why we have to do the colon equal to

577:16

sign for entering in the formula bar

577:19

there
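
As a rough illustration of what those three measures compute, here is a minimal sketch in Python rather than Dax. The rows and the variable names `job_count`, `average_salary` and `median_salary` are hypothetical stand-ins, not the course data. It mirrors the steps above: filter out the hourly rows, then aggregate the whole salary year average column at once.

```python
from statistics import mean, median

# Hypothetical rows standing in for the imported table (not the course data).
rows = [
    {"job_title_short": "Data Analyst", "salary_rate": "year", "salary_year_avg": 90000.0},
    {"job_title_short": "Data Engineer", "salary_rate": "year", "salary_year_avg": 130000.0},
    {"job_title_short": "Data Scientist", "salary_rate": "year", "salary_year_avg": 350000.0},
    {"job_title_short": "Data Analyst", "salary_rate": "hour", "salary_year_avg": None},
]

# Mirror the filter step: keep only rows whose rate is not "hour".
yearly = [r for r in rows if r["salary_rate"] != "hour"]
salaries = [r["salary_year_avg"] for r in yearly]

# Each "measure" is one aggregation over the whole remaining column.
job_count = len(yearly)
average_salary = mean(salaries)
median_salary = median(salaries)

print(job_count, average_salary, median_salary)
```

With one very high salary in the sample, the average lands well above the median, which is one reason the median is the more useful number for salary data.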

577:23

so with measures it's important to

577:24

understand implicit versus explicit

577:27

measures so let's close out the power

577:28

pivot window and actually get into

577:31

exploring these different measures by

577:33

creating a pivot table of that median

577:35

salary we just created so I'm going to

577:37

go to insert pivot table from data model

577:40

we're going to insert it into the

577:42

existing sheet here we have our table of

577:44

data job salary I'm going to analyze the

577:47

salary based on the job title short

577:49

column so I'll put job title short into

577:51

the rows and then look if we scroll down at

577:55

the very bottom you'll notice that the

577:58

measures that we created have this f of x

578:01

basically it shows us an equation that

578:03

it is a measure so I can take these

578:06

measures this median salary and in this

578:09

case drag it into the values and now

578:11

unlike power pivot where it did in that

578:13

same column we're now filtering down to

578:15

do it by well the appropriate job titles

578:19

now we could also do something like drag

578:21

the job count into the values as

578:23

well and actually see the job count

578:26

there now both of these measures are

578:29

explicit measures because we explicitly

578:33

defined it we defined what job

578:35

count is and what median salary is so

578:38

what is an implicit measure well you

578:40

actually created this before so in

578:43

regards to that job count we're doing a

578:45

count of the job title short column if I

578:48

were to drag that down into here you can

578:50

see it says count of job title short

578:54

this is an implicit measure these are

578:58

great for quick short analysis as we

579:00

demonstrated before you can quickly

579:02

throw something in and generate it and

579:03

you didn't even know you were using

579:05

measures and you were similarly with the

579:07

salary year average if I drag that in

579:09

down here we previously well changed

579:12

this up to actually perform an average

579:14

moved to average from there that was also

579:17

an implicit measure so I think you get

579:19

the point but we're going to see the

579:22

power of this as we go through this when

579:24

we start to make newer measures that are

579:27

actually going to use our explicit

579:29

measures specifically we're going to be

579:31

using our job count in other

579:34

calculations and so these explicit

579:36

measures are going to save our butt and

579:38

save us so much time and ensure we're

579:40

doing the correct

579:44

calculations so let's get into our first

579:46

calculated column and we're going to be

579:48

going back into the power pivot window

579:51

for this we're going to be creating a

579:54

column that will convert the salary year

579:57

average values into Euro values so

580:02

there's a couple ways we can do this or

580:03

add these columns we can go under design

580:06

and right here under columns we can

580:08

click add to add a column additionally

580:11

with that unselected selecting

580:13

back in again you see this add

580:15

column up here we can just go right in

580:17

and add a column I feel that's actually

580:18

easier anyway in this case in order to

580:21

get the Euros value of what it is for

580:24

salary year average we need to multiply by

580:26

a conversion rate so inside of here I'm

580:29

going to put the equal sign and we see

580:31

it's popping up here in the formula bar

580:33

from there I'm going to use the value in

580:34

salary year average I just selected one

580:37

of the values and it popped right in

580:39

then from there similar to how we wrote

580:42

formulas before I'm going to put times

580:45

0.9 and press enter now notice for this one I

580:48

didn't use the colon equal sign right

580:50

because this is not a measure it's a

580:52

calculated column and it still knew that

580:55

this was a currency although I don't

580:57

like it it has two decimal places so

580:59

I'll remove it and to me it knows it's a

581:02

currency but it doesn't know that it's a

581:03

Euro so I'm actually going to convert it

581:05

over to Euro and then remove the two

581:07

decimal places additionally I'm going to

581:08

rename this from calculated column one

581:11

to salary year Euro you can identify

581:15

calculated columns because normal

581:16

columns are green the calculated columns

581:18

are black also if I go to the diagram

581:21

view we can see that well you can't

581:24

really tell that we have the calculated

581:25

column salary year euro but you can see your

581:28

different measures that you've created
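
To make the calculated-column idea concrete, here is a minimal sketch in plain Python (not DAX) of what the salary year euro column does: one value computed and stored for every row. The sample rows, the underscore-style column names, and the helper `add_salary_year_euro` are illustrative assumptions; 0.9 is the conversion rate used in the lesson, not a live rate.

```python
# Hypothetical sample rows standing in for the data job salary table.
rows = [
    {"job_title_short": "Data Analyst", "salary_year_avg": 90000.0},
    {"job_title_short": "Data Engineer", "salary_year_avg": 125000.0},
    {"job_title_short": "Data Scientist", "salary_year_avg": 130000.0},
]

USD_TO_EUR = 0.9  # rate used in the lesson; an assumption, not a live rate


def add_salary_year_euro(table, rate=USD_TO_EUR):
    """Return a new table with a salary_year_euro value computed per row."""
    return [
        {**row, "salary_year_euro": row["salary_year_avg"] * rate}
        for row in table
    ]


with_euro = add_salary_year_euro(rows)
for row in with_euro:
    print(row["job_title_short"], round(row["salary_year_euro"]))
```

The point of the sketch is that a calculated column is evaluated once per row, whereas a measure is evaluated per pivot table cell over whatever rows are in view.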

581:30

all right so back to the data view even

581:32

though we have this calculated column we

581:35

could also create a measure on this

581:38

calculated column clicking in the box

581:40

below here and then typing in here I

581:42

could do something like median salary

581:44

Euro and then put in that median

581:47

function for salary year Euro and then

581:50

close the parenthesis and Bam now we

581:53

have it I'm going to spread it out to

581:54

actually see we have the value of €

581:57

103 now going back into here we can take

582:00

this and actually if we wanted to we

582:02

could put the salary year euro into

582:04

there that column it's going to

582:05

aggregate it appropriately right now

582:07

it's doing a sum so if I wanted to I

582:09

could get an average of these

582:13

values or we could actually go to that

582:16

measure that we created that explicit

582:18

measure throw it in here and we get the

582:21

explicit value of the median salary

582:25

Euro all right so let's shift our focus

582:27

on this analysis let's say we wanted to

582:29

analyze more around the date

582:32

specifically the day of the week for

582:35

when job postings are occurring well

582:37

let's go back into Power pivot and

582:40

manage to open up the power pivot window

582:42

right now investigating the diagram

582:44

view of our data model we only have one

582:46

table in here data jobs salary well if

582:50

we go under the design tab as talked about

582:52

in the last lesson we can actually

582:56

create a date table I could also

582:59

potentially mark this table of data job

583:01

salary as a date table but it's not a date table so we

583:03

actually need to create one and you'll

583:04

see what it looks like after that and

583:06

with that I did click new on this anyway

583:09

it created this new table called

583:12

calendar and inspecting all of the

583:14

different values in here well let's

583:16

actually just get out of this view let's

583:18

actually go to the data view which

583:21

is pretty cool with what it

583:24

created it created it based on the dates

583:26

it knew what was in our original table

583:28

so from the first day of 2023 all the way to

583:31

the last day of 2023 and with this it

583:34

has a year column month day of week and

583:38

day of week number so a lot of great

583:40

values from it now we need to actually

583:42

connect these two there's no

583:44

relationship between the two if we go to

583:47

that data jobs salary so selecting it

583:50

here we only have this job posted

583:53

date column which is a date and a time

583:57

so we need only a date so because this

583:59

column is named inappropriately I'm

584:01

going to change it to job posted date

584:04

time so now let's create that new column

584:06

with that job posted date time this time

584:08

though instead of clicking add column

584:09

we're going to go to insert function and

584:12

this is pretty neat because it allows us

584:14

to actually look under different things

584:16

in this case we wanted sort of a text

584:18

function and we can look and explore

584:20

different ones specifically I know we

584:22

want this one format converts a value

584:24

to text in the specified number format

584:27

so I'm going to click okay and it

584:29

automatically fills it in with this

584:31

colon and equal sign of format equal to

584:34

from there I'll select the job posted

584:36

date time column that's the value and

584:39

then what do we want for the format well

584:41

I know we want in the format of

584:43

basically the year first then two months

584:46

or two M's and then two D's for month

584:48

and date in order to match close that

584:51

double quote because that's the actual

584:53

format we're using that's all we need so

584:56

we'll close the parentheses and press

584:58

enter and then I'm going to take this

585:00

calculated column one drag it over here

585:03

and then I can see that it did convert

585:04

it correctly so I'm also going to go now

585:06

and rename this appropriately to job

585:09

posted date press enter so now let's

585:12

create a relationship between the two

585:14

remember we can go to that diagram view

585:16

or I can use this create relationship

585:19

go to calendar to match on the date

585:21

itself let's see what it looks like in

585:24

that actual diagram view we always want

585:26

to inspect it to make sure we have this

585:28

right one to many or one to one anytime

585:31

we have many to many you need to start

585:33

questioning it depending on what the

585:35

data is anyway we now have a

585:38

relationship established with

585:42

this so let's actually get into

585:45

analyzing this with our calendar based

585:48

on this day of the week and seeing what

585:50

is the proportion that they're turning

585:52

out during the week for job postings so

585:55

closing out the power pivot window I'm

585:56

going to go in and create a new sheet

585:59

from there I'm going to go to insert

586:01

pivot table from data model we're going

586:03

to do it in the existing worksheet

586:05

underneath calendar underneath more

586:08

Fields I'm going to drag in day of week

586:11

into the rows so it has Sunday all the

586:14

way to Saturday then from there remember

586:16

we created that job count already so I'm

586:19

going to take that and drag that into

586:21

the values so looking at this I can see

586:25

that I think our relationship is not set

586:28

up properly cuz we have basically the

586:29

blanks at 32,000 I think I know what's

586:32

going on with this let's go back into

586:34

the power pivot window in calendar when

586:36

we select the date it's of the data

586:39

type date it also has this format of

586:41

date and time I don't think that really

586:43

matters too much but if we go into Data

586:44

job salary and we go to that job posted

586:47

date because we used that format function

586:50

right now the data type is auto of text

586:54

we need it to be of date and this now

586:58

looks a lot more similar to what it does on

587:01

the calendar now when I close out of

587:03

this bam all the values pop up here so

587:07

don't forget about your data types and

587:08

making sure they match within the

587:10

data model so let's actually visualize

587:12

this by inserting a pivot chart and Bam

587:15

we get this bad boy which we'll rename

587:17

to when are most jobs posted during

587:20

the week and it looks like we have well

587:22

Saturday and Sunday are the lowest

587:24

obviously during the week it's the

587:25

highest with basically a higher amount

587:27

on Wednesday so pretty cool analysis

587:30

that we were able to do based on the day

587:33

of the week we didn't have to create any

587:35

additional things and additionally we

587:38

can evaluate based on this calendar

587:40

table created we can do other analysis

587:42

such as by the year month day of the

587:45

week and whatnot all right so that's a

587:47

brief intro into measures and also

587:50

calculated columns don't worry too much

587:53

if you're not feeling too confident with

587:55

them just yet as one you have some

587:57

practice problems to go through to get

587:58

more familiar with it but the next

588:00

lesson will be and the next two lessons

588:02

will be on DAX and DAX advanced in order

588:05

to explore different formulas that you

588:07

can also use inside of your measures and

588:10

also calculated columns all right with

588:13

that I'll see you in the next one where

588:15

we're getting into DAX see you there
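
As a recap of the data type problem from this lesson, the sketch below (plain Python, not DAX) shows why a text key produced by a format step fails to match a calendar keyed by true dates, so every row lands in the blank bucket. The sample calendar, the postings, and the helper `count_by_day` are hypothetical and only illustrate the matching behavior.

```python
from datetime import date, datetime

# Hypothetical calendar table keyed by real date values.
calendar = {
    date(2023, 1, 1): "Sunday",
    date(2023, 1, 2): "Monday",
}

# Hypothetical job postings carrying a date-and-time stamp.
postings = [
    datetime(2023, 1, 1, 9, 30),
    datetime(2023, 1, 2, 14, 5),
    datetime(2023, 1, 2, 18, 0),
]


def count_by_day(keys):
    """Count postings per day of week; unmatched keys land in '(blank)'."""
    counts = {}
    for key in keys:
        day = calendar.get(key, "(blank)")
        counts[day] = counts.get(day, 0) + 1
    return counts


# Text keys, similar to what a format-to-text step produces.
text_keys = [p.strftime("%Y-%m-%d") for p in postings]
# True date keys, similar to the column after switching its data type to date.
date_keys = [p.date() for p in postings]

mismatched = count_by_day(text_keys)  # nothing matches the calendar
matched = count_by_day(date_keys)     # rows match on the date value

print(mismatched)
print(matched)
```

The values look the same to the eye in both cases; only the data type of the key differs, which is why inspecting data types on both sides of a relationship is worth the extra step.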

588:21

welcome to this lesson on DAX or data

588:24

analysis expressions we've used it a few

588:26

times before in the previous lesson but

588:29

now we're going to go much more in depth

588:31

and actually understand the basics of

588:33

it now as we've learned DAX can be used

588:35

within measures or even calculated

588:38

columns for the purpose of what we're

588:40

going through in the project we're not

588:42

going to create any calculated columns

588:44

but we will be using it for measures for

588:47

this we're going to be focusing on three

588:48

major types of functions in this lesson

588:51

specifically around aggregation

588:53

statistical and also filter these

588:55

functions you're going to notice are

588:57

very similar to your Excel functions

589:00

that we did back in Chapter 2 so a lot

589:03

of those similarities and concepts we've

589:05

learned already are going to be able to

589:06

be applied to this so we'll be able to

589:08

move pretty quick now we're going to be

589:09

answering two major questions regarding

589:12

our final project the first one involves

589:15

calculating the number of skills

589:18

required per job title we're going to

589:21

use DAX in order to calculate this and

589:23

then we're even going to go on to

589:25

actually graph this to show how it

589:28

correlates with median salary spoiler

589:31

alert the more skills you have the

589:33

higher median salary you can expect from

589:35

there we're going to go into a deeper

589:37

analysis of salary specifically looking

589:39

at the median salary and specifically

589:42

being able to compare it from your home

589:45

country to the US and also non us

589:48

countries so we're going to use a filter

589:50

function in order to be able to view

589:52

these things within a pivot table now

589:54

jumping right into Excel for this you

589:57

can continue working in the Excel file

589:59

that you have from that first lesson on

590:02

power pivot intro where we created this

590:05

visualization right here which analyzes

590:07

top skills of data nerds and has some

590:08

filters for job title and Country if you

590:11

don't happen to have that file anymore

590:12

or you got lost along the way you can

590:14

just use the power pivot intro part two

590:17

file and you can start from there now if

590:19

you're loading it via the power pivot

590:21

intro part two file you're going to have

590:23

two sheets in there one skill job

590:25

analysis and then also the skill

590:27

analysis we're not actually going to be

590:28

using the skill analysis so you can feel

590:30

free to delete this or conversely if

590:33

you're working from the files that

590:35

you've been building up during this and

590:37

didn't necessarily load from the power

590:39

pivot intro part two file you may have

590:41

multiple tabs in there once again I only

590:44

care about this skill jobs analysis

590:46

where we have this this is what we're

590:47

going to keep for the final project the

590:49

job analysis and also this other one

590:51

that we created back in the power query

590:54

lesson we're actually going to be

590:55

recreating it with power pivot so both

590:58

of these I can just delete or anything

591:00

else you have in there you can feel free

591:01

to delete after holding control and

591:03

selecting both of those I'm just going

591:04

to delete

591:08

them all right so we're going to be

591:09

looking at aggregation functions first

591:12

conveniently Microsoft has some

591:13

documentation around the DAX functions

591:16

and also statements that they have so

591:18

I'm going to dive right into the link

591:20

that's provided on the screen underneath

591:22

DAX functions specifically I'm going to

591:24

go into the aggregation functions they

591:27

have this page here on aggregation

591:29

functions overview and it shows a lot of

591:32

the different functions they have for

591:33

this average count Max Min sum let's

591:36

look at count real quick count is pretty

591:39

simple all we're going to do is use the

591:41

following syntax count and inside of it

591:44

you provide a column and for this it

591:46

says Hey the column that contains the

591:48

values to be counted so pretty simple

591:51

function to use similarly we have

591:53

distinct count which has a similar

591:56

syntax of you provide distinct count and

591:58

the column the column that contains

591:59

the values to be counted and it will return

592:02

the number of distinct values in a column

592:04

we're going to use this so what we're

592:06

going to be calculating with those

592:08

functions that we just went over is

592:10

trying to find out how many skills per

592:13

job we're going to first go through

592:14

based on a job title and find not only

592:16

the skill count but also the job count

592:19

and then we're going to take both these

592:20

values and divide them to get the skills

592:24

per job so I'm going to create a new

592:25

sheet for this and inside of here I'm

592:28

going to insert in a pivot table from

592:30

our data model we're going to do it in the

592:32

existing worksheet for the rows we're

592:34

going to go to the data job

592:36

salary table and we're going to put that

592:37

job title short into the rows and then

592:40

now we need the skill count remember we

592:44

could go in and do something and create

592:46

an implicit measure by throwing job

592:48

skills into the values but we want an

592:50

explicit measure because we're actually

592:52

going to be using the skill count in a

592:54

later calculation to find that skills per

592:56

job anyway how do we do this well we can

592:57

also not only create a measure by going

592:59

to power pivot and underneath here going

593:01

to new measure you can also just select

593:04

in here which table you want to use in

593:06

this case I'm doing a skill count so I

593:08

want to contain it in the data jobs

593:10

skills table doesn't really matter which

593:12

table I'll put it in but I just go by my

593:14

memory of which one I'm going to know to

593:16

go look in for it in there it auto

593:18

selects that table of data jobs skills

593:21

the measure name is going to be skill

593:23

count and then for the formula itself we

593:26

want to do a count of the job skills

593:29

column from the job skills table make

593:32

sure it's not from the job salary table

593:34

okay I'm going to put a closing

593:35

parenthesis on this and then for this we

593:37

do want to format it to use a thousand

593:40

separator and zero decimal places click okay and now in

593:42

the data job skills table we have this

593:45

explicit measure I can drag it right next

593:47

to it same values are getting created as

593:49

the implicit measure so I'm going to

593:51

take out that implicit measure next

593:53

thing we want to calculate is that job

593:55

count we're going to be counting it

593:57

based on the distinct values of the job

593:59

ID so I'm going to go to add measure we're

594:01

going to call this one job count and

594:03

we'll do a distinct count of we want to

594:06

do it of the job ID column and for this

594:09

one we want to make sure that we're

594:10

actually doing it from the salary or

594:12

data jobs salary table because this has

594:15

all the job IDs in it once again we're

594:17

going to format as a number with a thousand

594:19

separator and click okay and then I'm

594:21

going to drag at the bottom the measure

594:23

is going to appear I'm going to drag it

594:24

into here so now we want to get how many

594:27

skills per job so we want to take the

594:29

skill count measure and divide it by the

594:32

job count measure this one doesn't really

594:34

matter too much because it contains both

594:36

of them but I'm going to put this in the

594:37

data jobs skills table I'm going to call

594:40

this skills per job now what's great

594:44

about these explicit measures that we

594:46

just created is I can go hey I want to

594:48

do this skill count and I want to divide

594:50

it by the job count and it's right

594:53

there so you don't have to necessarily

594:55

write out every single time okay I want

594:57

to do a count of the job skills column

595:00

and then divided by a count of the job

595:03

ID column which actually needs to be a

595:06

distinct count anyway this is where we

595:07

run into errors that's why the explicit

595:09

measures are so great all

595:11

right so I have skill count divided by

595:12

job count I'm going to create it as a

595:13

number and I want one decimal place for

595:16

this go ahead and click okay and then

595:19

we're going to add this skills per job

595:20

to here now I'm actually going to

595:22

recommend although we just used the

595:24

division sign I'm going to actually

595:26

recommend this divide function with it

595:28

which is a math function and what

595:30

you would do in this case is you would

595:32

provide divide and you list a numerator

595:35

and a denominator and the reason why I

595:37

like this is because it fixes any type

595:41

or catches any error specifically it

595:43

performs division and returns alternate

595:45

result or blank on division by zero

595:49

so we're not going to necessarily error out

595:51

if we have a division by zero issue

595:53

and you can actually provide as shown

595:55

down here in the alternate result the

595:57

value returned when division by zero

595:59

results in an error so you could

596:00

actually catch that anyway so I'm going

596:02

to go back into that skills per job and

596:05

I'm going to go to edit measure I'm

596:06

going to change this to divide specify

596:09

the first and second parameter with a

596:11

comma and then click okay okay overall

596:15

no real change here but just a best

596:17

practice to know about
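
The skills per job logic above can be sketched in plain Python (not DAX) as follows. The sample rows, the `job_ids` list, and the helper names `skill_count`, `job_count`, and `divide` are hypothetical; `divide` only mimics the behavior described for the divide function, returning an alternate result (blank by default) instead of raising an error on division by zero.

```python
# Hypothetical rows from the job skills table: one row per (job, skill) pair.
job_skills = [
    {"job_id": 1, "job_skills": "sql"},
    {"job_id": 1, "job_skills": "excel"},
    {"job_id": 2, "job_skills": "python"},
    {"job_id": 2, "job_skills": "sql"},
    {"job_id": 2, "job_skills": "tableau"},
]

# Hypothetical job ids taken from the salary table.
job_ids = [1, 2]


def skill_count(rows):
    """Count the non-blank values in the job skills column."""
    return sum(1 for row in rows if row["job_skills"] is not None)


def job_count(ids):
    """Count the distinct job ids."""
    return len(set(ids))


def divide(numerator, denominator, alternate_result=None):
    """Divide safely: return alternate_result when the denominator is zero."""
    if denominator == 0:
        return alternate_result
    return numerator / denominator


# Reuse the two "explicit measures" instead of rewriting their logic.
skills_per_job = divide(skill_count(job_skills), job_count(job_ids))
print(skills_per_job)
```

Building the ratio from the two named helpers mirrors the reason for explicit measures: the distinct count is defined once, so a later calculation cannot accidentally use a plain count instead.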

596:21

so now with this skills per job I want

596:24

to actually get into comparing this to

596:26

median salary this is what we're going

596:27

to be building right here we're going to

596:29

be comparing it to median salary and

596:31

then graphing it in a scatter chart in

596:33

order to see how these different job

596:35

titles correlate to each other so first

596:37

so that we know what the final analysis is

596:39

going to be of this I'm going to rename

596:41

this sheet appropriately specifically

596:43

calling it salary vs skills and this

596:47

pivot table here we don't need

596:48

necessarily the skill count or the job

596:50

count we just need the skills per job

596:53

okay we're going to calculate now the

596:54

median salary and median is a

596:57

statistical function which is

596:59

encountered underneath here but there's

597:01

a lot of different options underneath

597:03

here such as median finding the

597:05

different percentiles like we did back

597:06

in the formulas looking at things like

597:09

standard deviation and whatnot so a lot

597:11

of good statistical functions that you

597:12

have access to via DAX so for this

597:15

measure I'm just going to come up here

597:16

to power pivot go under measures and

597:17

select new measure I do want this in the

597:19

data job salary table and we're going to

597:22

call this median salary for this we're

597:24

going to be using the median function

597:27

and we need to provide it a column

597:29

specifically that salary year average

597:32

value for formatting we're going to

597:34

format it as a currency with zero

597:36

decimal places since it's a salary so

597:38

now we have median salary here I

597:40

actually want it to appear on the x-axis

597:43

so I'm going to throw it over here on

597:45

the first column so now we have the

597:47

median salary and skills per job I'm

597:49

just going to rate these or sort these

597:51

from highest to lowest to see if I can

597:54

see visually if there's anything going

597:56

on with a correlation right now I am

597:58

seeing some higher skills with a

598:01

higher salary but let's actually

598:03

visualize this so I'm going to insert

598:05

pivot chart and select pivot chart

598:08

for this we want to enter a scatter plot

598:12

and if you remember back from our charts

598:13

lecture we're going to have issues with

598:15

this you can't create this chart with

598:17

the data inside the pivot table doesn't

598:19

natively support creating scatter plots

598:22

kind of annoying if you ask me anyway

598:24

let's X out of this and for this what

598:26

we're going to do is we're just going to

598:28

set this area starting up here we're

598:30

going to set it equal to this entire

598:33

table right here I'm not going to

598:35

capture the grand total at the bottom

598:37

because we're not going to be plotting

598:38

that now with these values I'm going to

598:40

select the contents in columns

598:42

F and G and then from there go insert a

598:45

scatter plot specifically this one right

598:48

here I can see it already looks pretty

598:49

good you can't actually add the data

598:51

labels in whenever you create this chart

598:53

we actually have to go about doing that

598:55

somewhat manually specifically we have

598:57

to select on the data points and then

598:59

right-click it and we have to select add

599:02

data labels okay now it's giving us

599:05

points which actually

599:07

correlate to the skills per job point

599:09

it's not what we want we want to include

599:11

the job title we're going to add that so

599:13

what we're going to do is select one of those

599:15

values and just right-click it and then

599:17

from there select format data labels

599:20

then the pane's going to open up on the

599:21

right hand side and it should pop you up

599:24

underneath label options label options

599:26

then this label options and right now we

599:28

have this y value selected that's not

599:30

what we want we want value from cell and

599:33

it says Hey select the data label range

599:36

what we want is right here all the way

599:38

going down it's hidden behind here I'm

599:40

going to sort of guess but I know it

599:42

goes down to E11 click okay and

599:44

scrolling it over bam we got all those

599:46

data labels on there now all right so

599:49

now we need to clean this bad boy up

599:51

because well it's a hot mess that is all

599:53

up in the upper right hand quadrant

599:55

labels are overlapping we're going to

599:57

fix all of this first thing is I'm going

599:58

to correct the axes so I'm going to

600:01

click on the y or click on the x axis

600:04

and it should go immediately to this

600:05

minimum value underneath axis options

600:08

and I can see the first value stops

600:10

around or begins around 80,000 so I'm

600:13

going to change this to that and press

600:15

enter okay similarly I'm going to select

600:18

the y-axis and if it doesn't go to it it should

600:20

be under axis options inside that

600:22

format axis pane and I'm going to

600:24

select this first value that I want to

600:26

go to is three I'll leave the default of

600:28

nine there next thing is we need some

600:31

axis labels for the y axis we'll call

600:34

this average skills requested for the

600:36

x-axis we'll call this median salary and

600:38

we'll specify the units of USD speaking

600:40

of which this is not formatted correctly

600:43

for how we want the numbers so under

600:46

that format axis pane under axis

600:47

options and under axis options

600:50

again under number we can go to the

600:53

custom option specifically you should

600:55

have this type hopefully appearing up if

600:58

not you can just enter it into this

600:59

format code below and then press enter

601:02

all right the last two things to do is

601:04

rename the title naming it do more

601:06

skills equal more money for data nerds

601:09

which from this chart it looks like it

601:11

does and we can actually confirm this if

601:14

we want by adding a trend line now

601:16

there's different options here for trend

601:18

lines we've gone over linear

601:19

exponential linear forecast I feel

601:22

linear best meets this need here I also

601:24

like the coloring aspect of it so we're

601:26

going to go with that all right the last

601:27

thing to do is just fix some of these

601:29

names on here so right now we have the

601:33

data labels appearing to the right of

601:36

the data point and in cases where it's

601:39

close so senior data scientist it's

601:42

too close to the edge and so it's just

601:44

sort of over the top of it anyway what

601:46

you can do is actually select it twice

601:48

so click it twice then you can drag and

601:51

drop it and it should have these arrows

601:55

or these connectors that connect the

601:56

name to where it goes to all right so

601:59

now we have our final

602:02

visualization and I'd say it's not too

602:05

bad some things I'm noticing about this

602:08

some correlation if you notice yes we do

602:11

see the average skills requested are

602:13

going up with the salary but those jobs

602:18

I mean you can pretty much see a

602:19

dividing line those jobs that end

602:21

in engineer versus analyst or scientist

602:24

are commanding or requesting more skills

602:27

but yet have sort of a similar pay to

602:31

their data analyst or scientist

602:33

counterparts so I don't know I guess it

602:35

kind of pays to be a data analyst and

602:37

not a data engineer don't tell my data

602:39

engineer friends I said

602:43

that all right last analysis we're going

602:45

to get into is using filters to actually

602:50

aggregate so in this case right here

602:51

we're showing what we're going to get to

602:53

the final thing of based on a job title

602:56

short value what is the median salary in

602:59

this first column for the us then what

603:02

is the median salary for non us and then

603:06

finally that final column of median

603:08

salary what is the median salary of in

603:11

this case the selected country is

603:13

Argentina it's filtered down basically I

603:16

call this filter function we're going to

603:17

go over but we're going to be

603:18

calculating or figuring out how to prevent

603:21

filters from affecting a visualization

603:23

so we can get the core values that we may

603:26

want so we're going to create a new

603:28

sheet and I'm going to call this salary

603:30

analysis like before we're going to

603:32

insert a pivot table from our data model

603:34

insert it into this new sheet and we're

603:36

going to be putting that job title short

603:38

into the rows now we're obviously with

603:41

this going to be calculating median

603:42

salary so I'm going to go ahead and just

603:44

drag that into the values to start

603:46

getting those median salaries

603:47

additionally we're going to want to

603:48

include a slicer in here so based on the

603:52

job country so I'm going to insert

603:53

slicer on job country click okay and

603:57

then with this we can actually see if we

603:59

select something like Argentina it's

604:01

going to filter down to what it is or

604:04

what the median salary is in

604:06

Argentina but remember we're trying to

604:08

add two columns to this so we can

604:10

compare these values of something like

604:11

Argentina to us salaries and maybe non

604:15

us salaries so basically countries

604:17

outside the US anyway we're going to be

604:19

using filter functions for this and a

604:22

warning on this it says it here the

604:24

filter and value functions in DAX are

604:26

some of the most complex powerful and

604:29

differ greatly from Excel functions so

604:31

there's going to be a little bit of

604:32

complexity here in understanding this

604:34

and for this filter function we're going

604:36

to be using this one on calculate and

604:39

what it does is it evaluates an

604:41

expression in a modified filter context

604:45

calculate is pretty simple in my opinion

604:47

first you provide an expression so such

604:49

as hey perform a count of this column or

604:51

a median of this column from there you

604:54

provide a filter or filters and as it

604:57

states below here filters can be Boolean

605:00

filter expressions table filter expressions or

605:01

filter modification functions main thing

605:03

is here we're going to use things like

605:06

logical operators in order to compare

605:08

this to maybe a certain value we're

605:10

going to expect so let's jump into

605:13

creating our first one with median

605:15

salary evaluating for median salary in

605:18

the United States
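A minimal sketch of the measures this walkthrough builds, written in DAX measure notation. The table name data_jobs_salary and the column names salary_year_avg and job_country are assumptions inferred from how they are spoken, and the non-US measure name is a placeholder; substitute the names from your own data model.

```dax
// Base explicit measure (assumed to exist already from an earlier lesson)
Median Salary := MEDIAN ( data_jobs_salary[salary_year_avg] )

// Median salary restricted to US postings
Median Salary US :=
CALCULATE (
    [Median Salary],
    data_jobs_salary[job_country] = "United States"
)

// Same idea with the not-equal operator (<>); measure name is hypothetical
Median Salary Non US :=
CALCULATE (
    [Median Salary],
    data_jobs_salary[job_country] <> "United States"
)
```

Because the boolean filter replaces whatever filter is already on job_country, the US value should stay fixed when a different country is picked in the slicer, while still responding to the job title on each row.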

605:20

so I want to create this measure inside

605:22

of our data job salary column sorry data

605:24

job salary table and for this we're

605:27

going to call it median salary US we're

605:31

going to be using the calculate function

605:33

for this and inside of here we're going

605:35

to insert an expression so in our

605:38

case the expression is the median of the

605:41

salary year average column and what

605:45

we're going to do actually I'm just

605:46

going to leave this as is cuz filter is

605:48

optional we can tell filter is optional

605:49

based on the square brackets around it

605:51

I'm going to just close out this

605:53

calculate function change this to a

605:55

format of currency with zero decimal

605:57

places and then from there take that

605:59

median salary US and actually drag it

606:01

onto here so right now calculate is

606:03

working by calculating the median salary

606:07

and there's no filters applied to it so

606:09

pretty simple so let's go in and

606:11

actually edit this measure now now

606:13

remember we have an explicit measure of

606:15

median salary so I actually don't even

606:18

need to Define it like I did here I can

606:21

actually just call out median salary in

606:23

this case clicking okay still the same

606:26

value going back in and actually editing

606:28

it we now want to apply a filter

606:32

specifically for this filter we want to

606:35

make sure that the job country column is

606:38

equal to United States so I'm going to

606:41

type in job country and we can use

606:44

logical operators so I'm going to use an

606:46

equal sign right next to this and I'm

606:48

going to specify United States need to make

606:51

sure it's spelled exactly right I know

606:53

it's that via the column okay so now

606:56

we're going to leave everything else

606:57

as is click okay and Bam now it has the

607:02

median salary filtered by the US and I

607:05

can confirm this by scrolling down to

607:07

the United States clicking United States

607:10

and seeing that these values are the

607:11

same but no matter what I actually click

607:14

the United States median salary is going

607:16

to stay the same additionally if you

607:19

noticed here when I click on something

607:20

like the US Virgin Islands wouldn't mind

607:22

moving there they only have four job

607:25

titles available so because of that they

607:28

just filter this table down to only show

607:31

those four that are applicable along

607:33

with their applicable salaries and median

607:35

salary in the US so now let's calculate

607:38

the median salary for non-US countries

607:40

and actually see how they differ so come

607:42

into data job salary select add measure for

607:45

this we're going to be using non-US

607:47

values once again we want to use that

607:48

calculate function on the median

607:51

salary measure that we created and for

607:54

this one we're still evaluating the job

607:58

country but we want it not equal to so

608:00

we're going to use basically a less than

608:02

and greater than sign right next to each

608:03

other to say not equal to and we'll say

608:06

United States we're going to format this

608:08

as a currency with zero decimal places

608:10

click okay and then add this bad boy to

608:13

the values and I want to actually see a

608:16

country with more job postings in it so

608:19

we'll go to something like Australia and

608:21

now something like Australia we can see

608:24

one comparing US to non-US in general

608:27

the US well except for data Engineers

608:30

yeah it looks like only data Engineers

608:32

are the lowest one in another country

608:33

everything else is higher in the US but

608:35

now we can with this one compare hey

608:38

what does it look like something like

608:39

Australia compared to US and non-US

608:42

countries so super useful in actually

608:44

filtering down providing the right

608:46

context for what we want to look at so

608:48

as a data analyst median salary is

608:50

around 100,000 which is higher than US

608:53

and also any other non-US median salary

608:55

so may have to move to Australia one

608:58

last clean up right quick slicer itself

609:00

I don't like it to say job country we're

609:02

going to name this to country all right

609:04

now wrap up the analysis for this all

609:07

right so you now have some practice

609:08

problems to go through and test out

609:10

these different DAX functions that we

609:12

just went through along with some others

609:14

now in this lesson we just did some

609:15

basic DAX in the next one we're going

609:17

to be moving into some more advanced

609:19

DAX features that I do find myself

609:21

using from time to time but overall most

609:24

of the stuff we apply in this lesson I

609:26

use day to day all right with that see

609:28

you in the next lesson we'll be wrapping

609:29

up basically our final question in our

609:31

project and be done with our project see

609:33

you

609:37

there all right welcome to the last

609:39

lesson in this course where we're going

609:41

to be going over more advanced DAX

609:44

specifically we're going to be focusing

609:46

more in depth on filter and also

609:51

relation or relationship type functions

609:55

these are going to be needed by our data

609:56

model in order to calculate what is the

609:59

salary or median salary for an

610:01

Associated skill if you remember back to

610:04

a few lessons ago we had relationship

610:06

issues I know I feel that with having

610:09

them being able to filter tables in

610:11

certain directions and we're going to be

610:13

able to see that and fix that in this

610:15

lesson

610:19

so in this lesson you can start with

610:20

the workbook from the last lesson

610:23

or if you got lost along the way you

610:25

can go into the DAX intro workbook now

610:28

let's do a quick overview of where we're

610:30

at with which analysis we've done for

610:32

this project we've identified what are

610:35

the top skills of data nerds along with

610:38

different filters to filter for whatever

610:40

our interest is in my case I'm looking

610:42

for data analyst in the United States

610:44

and I can see that SQL Excel and

610:46

Tableau are some of the highest

610:47

additionally we've zoomed out a little

610:49

bit and been able to identify based on

610:52

job titles where our job title of

610:54

Interest Falls compared to others and

610:56

how many skills it requires for data

610:59

analysts it's right above business

611:01

data analysts and based on the number of

611:02

skills it looks like it's appropriately

611:04

rewarded for the median salary and then

611:06

final thing we did was be able to

611:08

analyze additionally Based on data

611:11

analyst we can look at different

611:12

countries and compare it not only in

611:14

that country but to within the US and

611:16

outside the US so a lot of good stuff

611:19

related to well data analyst that

611:22

position and analyzing the salary but

611:24

what about skills well we haven't done

611:27

that yet we're going to get into

611:29

actually analyzing in this first portion

611:32

analyzing what is the expected median

611:34

salary based on one of the top 10 skills

611:38

we did this back in the power query

611:40

lesson but now we have this new data

611:41

model we need to recalculate it anyway

611:43

we're going to run into some issues with

611:44

the data model as we're going to find

611:46

out additionally we're going to be

611:47

calculating the skill likelihood instead

611:50

of skill count basically finding the

611:53

percentage of a skill in a job posting

611:56

this is somewhat complex so this portion

611:58

here will be optional and you'll be able

612:01

to use job count instead if you don't

612:02

want to follow along with this skill

612:06

likelihood anyway back in your workbook

612:09

whether you started from that DAX

612:11

intro or you're continuing on from

612:13

the last lesson we're going to create

612:15

this new sheet for this and for this

612:18

we're going to name this

612:19

skill salary analysis as usual we're

612:22

going to go in and insert in a pivot

612:24

table from our data model so we can get

612:26

into analyzing the skills going click

612:28

okay insert it in and so for this I want

612:31

to analyze what is the median salary for

612:34

a skill so if I drag the job skills from

612:37

the data jobs skills table into the rows

612:40

we have all the different skills pop up

612:43

underneath here and then if we went up

612:46

here and then tried to drag or we will

612:48

be dragging in the median salary into

612:51

here all these values are going to be

612:54

the same additionally we get this popup

612:56

right here that relationships between

612:57

tables may be needed basically we're

612:59

running into an issue with our data

613:01

model even if I click auto-detect it's

613:03

going to tell me no new

613:04

relationships are found so what's going

613:06

on here well let's actually analyze our

613:09

data model by going to manage and then

613:12

inside of here go into diagram view so

613:15

the error resides with the filtering

613:18

direction remember this arrow right here

613:21

signifies which way we can actually

613:24

filter our data so in our case we have

613:27

job skills which is over here in the

613:30

data job skills table and we're trying

613:32

to find the median salary the problem is

613:36

we're basing that off of that salary

613:37

year average value that's in the data jobs

613:40

salary table and based on the direction

613:43

of this Arrow we cannot flow in the

613:46

opposite direction this is what we

613:48

call one-way or single filtering now

613:51

unfortunately Excel doesn't support

613:54

bidirectional filtering however in things

613:56

like Power BI you can actually go in and

613:59

change it from single filtering to both

614:03

or bidirectional filtering kind of

614:05

makes me wish I was in Power BI right now

614:08

so back in Excel we can't actually

614:10

control this via here and actually click

614:13

it to change this to bidirectional

614:15

filters we can only control the

614:16

relationship itself but we can use

614:19

DAX to fix this now in order to fix this

614:22

relationship we actually have

614:24

relationship functions inside of DAX

614:27

specifically we're going to use this

614:29

CROSSFILTER function with this function

614:32

you put inside of CROSSFILTER the

614:34

column names so in our case we can

614:37

specify basically the job ID from job

614:39

salary and the job ID from data job

614:41

skills and then from there we specify

614:43

the direction which the parameters under

614:46

here we can go into what we can provide

614:48

to directions we can either provide none

614:50

basically don't create a relationship

614:52

both which is what we want filters on

614:54

either side or one way which is what we

614:58

have already we're not going to use this

615:00

you also control filters left or filters

615:02

right the one way we're also not messing

615:04

with that we want both now this cross

615:07

filter remember is a filter function so

615:10

we need to use this in an appropriate

615:12

formula that we already know

615:14

calculate in order to filter so I'm

615:16

going to x out of this box right here

615:18

cuz that's not applicable
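Putting those two pieces together, CROSSFILTER goes in as a filter argument to CALCULATE. A hedged sketch, assuming the two tables are named data_jobs_salary and data_jobs_skills, that both have a job_id column joined by an existing relationship, and that a [Median Salary] measure already exists; the measure name here is only a placeholder:

```dax
// Median salary per skill, letting filters flow both ways
// across the job_id relationship for this one calculation
Median Salary Skills :=
CALCULATE (
    [Median Salary],
    CROSSFILTER (
        data_jobs_salary[job_id],   // assumed table and column names
        data_jobs_skills[job_id],
        BOTH
    )
)
```

CROSSFILTER only changes the filter direction while this measure is being evaluated; the relationship stored in the data model itself is left as is.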

615:19

what we're going to do is I'm going to

615:21

calculate median salary or a new median

615:24

salary if you will inside of the data

615:26

jobs skills table and because it's

615:30

going to use the same name but we're

615:33

going to keep it in a different table

615:34

it'll be perfectly fine and then for

615:37

this remember we want to use still

615:39

calculate we want to have an expression

615:41

in here in our case we want to calculate

615:44

what is the median salary and we'll just

615:47

use the explicit measure that we already

615:49

defined then from there we'll get into

615:52

the filter one of what we want to

615:54

actually filter we want to provide for

615:56

this cross filter and for this we're

615:58

going to specify the job ID of one table

616:01

along with the job ID of the other table

616:05

then for the filter type we're going to

616:07

use both okay I'm going to go ahead and

616:09

close this now we're calculating median

616:11

salary so I want this formatted as a

616:13

currency with zero decimal places I'm

616:15

going to go ahead and click okay and I have

616:17

an error in my formula should have known

616:19

that by the X I need to actually put a

616:21

closing parenthesis on here and I

616:23

lied to you a measure or a column with the

616:25

name median already exists okay I

616:27

thought we could do that silly me so

616:29

we'll name it median salary skills go

616:31

ahead and click okay okay now I'm going

616:34

to drag this into the values and we can

616:37

actually see with this one now that the

616:39

associated median salaries are actually

616:42

there and it's not all that 115,000

616:45

which is basically the median of the

616:46

entire data set so I'm going to go ahead

616:48

and move this other median salary out

616:51

of here and from there we're going to

616:53

also drag skill count into here I just

616:56

want to look at the top 10 most common

616:59

skills in this case so I'm going to go

617:00

up here into our filter and go to our

617:03

value filters for top 10 dot dot dot we

617:05

want the top 10 items by in this case

617:08

skill count and then from there based on

617:11

these top 10 skills I'm going to sort it

617:14

from largest to smallest but like usual

617:17

this is no good unless we actually

617:19

analyze for the country and also for the

617:23

title or job title so if I actually go

617:25

back into that skill jobs analysis I can

617:27

just select these two slicers right there

617:30

pressing control then copy it and paste

617:34

them into here now you may notice

617:35

whenever I'm clicking this this is not

617:37

affecting this pivot table right here so

617:40

we can actually inspect this by going to

617:42

the slicer and going to report

617:45

connections right now this slicer is

617:47

only affecting the skill job analysis

617:51

tab so this one right here in our case

617:53

for this job title we actually want to

617:55

affect it on this page here of skill

617:58

salary analysis which is right down here

618:00

click okay looks like the salary is

618:02

updated also we want to do the same

618:04

thing for Country adjusting the report

618:07

connections for this as well and

618:09

selecting this one right here for

618:10

underneath the sheet of skill salary

618:13

analysis clicking okay bam it updated as

618:16

well so now looking at the top skill of

618:19

data analyst in the United States which

618:20

I'm pretty familiar with I can see

618:22

things like python Oracle and Tableau

618:24

are top three Excel does make the list

618:27

and it's the second to last at 84,000

618:29

now with this I do want a visualization

618:32

with it specifically I want a combo

618:35

chart showing this so I'm going go into

618:37

insert pivot chart pivot chart and for

618:40

this go down to combo for this I want

618:42

the median salary to be the main focus

618:44

and then for the skill count we're going

618:46

to put that on a secondary axis because

618:49

right now it's just way too low if we

618:50

keep it on the same axis and this has

618:52

the format that I want right here go

618:53

ahead and click okay I'm going to hide

618:55

all the field buttons on the chart I'm

618:57

going to add a primary vertical and also

618:59

a secondary vertical axis along with a

619:02

chart title and then for the legend

619:05

itself I'm going to click it and then

619:06

rightclick it and go to format Legend

619:09

and for this it should go under Legend

619:11

options

619:13

I'm going to unclick this of show The

619:15

Legend without overlapping the chart and

619:16

I'm just going to move it up here so not

619:19

bad I don't necessarily want this orange

619:22

line right here for the skill count I

619:24

don't really feel like a line is best to

619:26

signify the count instead what I'm going

619:28

to do is select the line and if it

619:30

doesn't appear the format data series

619:32

you can also just right click it go to

619:33

format data series and then underneath

619:36

fill and line they have line but also

619:39

marker for the line we're going to go no

619:41

line and then for the marker we're

619:43

actually going to change the marker

619:45

options to built-in we'll change it to

619:48

this square is going to be fine or we

619:50

can change it to a diamond we'll make it

619:53

slightly bigger and I don't really like

619:55

the color so I'm going to go into design

619:57

and change the color to this

619:59

monochromatic palette 8 nope never mind

620:02

not that one I meant monochromatic

620:04

palette one I want the bar charts to be

620:06

more visually popping than the actual

620:08

markers themselves I'll change the title

620:11

to what's the pay of the top 10 skills

620:13

and then change the primary axis to

620:15

median salary USD and the other one

620:18

to job count closing this out and then

620:21

making some room over here for the

620:23

actual visualization itself so now we

620:25

have our visualization that we want that

620:27

looks at this and be able to show us

620:29

what are the top 10 skills for data

620:31

analyst and their Associated pay now one

620:34

last thing for this regarding slicers I

620:37

want to actually make it to where

620:38

they're connected between the charts so

620:41

right now I have it to where this

620:43

basically this one for skill salary

620:45

analysis tab if I go over to the skill

620:47

job analysis tab select business analyst

620:50

it will change then go back to skill

620:52

salary analysis it updated to business

620:54

analyst anyway I want it to if we change

620:56

a slicer to make sure that it changes on

620:58

the appropriate sheets so the job title

621:01

slicer is only on these two sheets

621:04

actually that one's perfectly fine but

621:06

the one we actually have concerns with

621:07

now is the country specifically on this

621:10

one I'm selected on the United States

621:12

the skill job analysis one it's also on

621:14

the United States and updates

621:15

appropriately but then if we look in the

621:17

salary analysis that one's on Australia

621:20

it's not updating appropriately so we

621:21

need to go to slicer report connections

621:24

and we're going to be putting the

621:26

country one on all the different sheets

621:28

so I'm going to go ahead and select all

621:29

the sheets for this I'm going to do the

621:31

same for skill salary analysis country

621:34

slicer which it looks like it updated

621:37

along for the skill job analysis so what

621:38

I'm going to do is actually copy this

621:39

now and put this into the salary versus

621:43

skills because we're controlling it on

621:45

this page as well and so now whatever I

621:48

select something like maybe

621:49

United Kingdom it will update

621:51

appropriately and update on other sheets

621:55

as well anyway one quick note

621:57

because we moved those titles around that

622:00

one time sometimes it's not going to

622:02

match up exactly how we had it before if

622:04

you recall I'm going to go ahead and

622:06

select all we set up these text boxes in

622:09

order to view them whenever basically

622:11

all countries were selected so that is

622:13

one of the issues about dragging and

622:15

dropping those titles and making them

622:16

stick to a certain location it messes

622:18

up your filters whenever you want to

622:20

filter down for something like the

622:21

United

622:24

States so this wraps up basically our

622:27

four major analyses that we did now I'm

622:30

going to take it a step further this

622:32

portion will be completely optional and

622:36

that's this right now we're using skill

622:40

count in order to look at what is you

622:44

know the skill count of in this case for

622:45

data analyst in we'll do United States

622:49

we see that SQL is around 400 4,000 and

622:53

that Excel is around 3500 but what does

622:56

that actually mean well if we go to the

622:58

Future file of what we're going to get

622:59

to we're actually going to be

623:00

calculating a skill likelihood instead

623:04

which in this case is looking at what is

623:06

the proportion of a skill compared to

623:10

all the different jobs that are

623:12

available for data analysts in the

623:15

United States and so that 4500

623:18

and almost 3500 is equal to well greater

623:21

than 50% for SQL and about 40% for Excel

623:25

so that makes in my mind a lot clearer

623:28

how important that skill is over a count

623:31

in that you probably should be learning

623:33

SQL and Excel as a data analyst so back

623:36

in our sheet where we're actually

623:37

calculating with the job count how do we

623:41

calculate this well let's actually get

623:44

to moving this over to here go back into

623:47

our pivot table itself and if we throw in

623:50

the job count you may get this

623:53

relationships between tables may be needed

623:55

don't worry about it too much now these

623:57

values are all stagnant based on some

624:00

issues with the filter Direction but

624:02

that actually comes to our advantage

624:05

because for our filter right here

624:07

specifically data analyst in the United

624:09

States the amount of jobs that actually

624:12

are

624:14

8339 if I actually remove both of these

624:17

filters we would expect it to be the

624:19

total rows of the column which is

624:23

32672 so coincidentally this is actually

624:26

doing what we need we just need to get a

624:28

percentage of these two values and that

624:30

can be done pretty easy so let's open

624:33

the show field list and actually get

624:35

into creating this measure we're going

624:37

to create in the data job skill table

624:39

we'll call this skill likelihood and

624:41

what this will do is take skill count

624:44

and divide it by job count but remember

624:46

we probably want to use the divide

624:47

function for this so putting in

624:49

skill count and then job count now

624:52

there's no option to format this as a

624:54

percentage unfortunately so I'm going to

624:55

go ahead and click okay from there I'm

624:57

going to drag the skill likelihood into

624:59

the values and go through and format

625:01

this appropriately selecting that it's a

625:04

percentage and then with this I'm going

625:06

to select a value that I

625:07

know what it should be of data analyst

625:09

in the United States and with those

625:11

values selected I can see that Excel is

625:14

at 41% which I know that's what it is

625:17

and SQL is at 53% for these values so bam

625:21

we have this skill likelihood
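The measure described here can be sketched as follows, assuming the [Skill Count] and [Job Count] measures from earlier lessons exist under those names in the data model:

```dax
// Share of job postings (under the current slicer selection)
// that mention the skill on the current row
Skill Likelihood :=
DIVIDE ( [Skill Count], [Job Count] )
```

DIVIDE returns a blank instead of an error when the denominator is zero, which is the usual reason to prefer it over the / operator. The result is a plain decimal, so the percentage format still has to be applied to the pivot table values.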

625:24

now go in and remove these other two

625:25

columns of skill count and job count and

625:29

then from here actually move this graph

625:31

back over and unfortunately with the

625:33

adjusting to it we actually have to fix

625:35

this and turn this back into a combo

625:38

chart so we're going to design change

625:39

chart type into combo select for the

625:43

skill likelihood we want this to be on

625:45

the secondary axis click okay go back to

625:48

format data series remove the line and

625:51

then change the marker option to be

625:53

built in and to be that diamond at 6

625:56

point and then finally update that

625:57

secondary axis to basically say it's

625:59

skill likelihood and Bam now we have

626:02

this final visualization now there's one

626:03

more that we actually do need to clean

626:05

up and that's this one right here what

626:07

are the top skills of data nerds right

626:10

now we're doing a count of the job ID an

626:12

implicit measure which you know how I

626:15

feel about that we should use an

626:16

explicit measure specifically we're

626:17

using skill likelihood instead of that

626:20

and remove that count of job postings

626:22

once again I need to actually format

626:24

this as a percentage so going to home

626:26

change it to a percentage and then from

626:28

there clicking in it and sorting from

626:31

smallest to largest and Bam for this one

626:34

data analyst in the United States once

626:36

again we can actually see visually what

626:38

are the top skills for this so now we

626:39

just updated both of these charts to

626:41

have a more representative understanding

626:44

of what's going on with the data all

626:46

right so you should be super proud of

626:48

what we just accomplished in this

626:50

project going through both power query

626:52

and power pivot and actually diving deep

626:54

to understand some key statistics about

626:57

top paying skills and also top skills

626:59

you should be targeting depending on

627:01

what job you're pursuing and what

627:03

country you're in now you do have some

627:04

practice problems to go through and test

627:06

out some of these more advanced

627:07

functions specifically this CROSSFILTER

627:09

function that we went over then after

627:11

that in the next lesson we're going to

627:12

be getting into how we can actually go

627:14

about sharing this project for those

627:16

that purchased the course practice

627:18

problems and also certificate you can

627:20

now go through and complete that end of

627:22

course survey and you'll be rewarded

627:25

with this course certificate now if you

627:26

didn't do this it's not too late for you

627:28

to go in and purchase the course that way

627:31

you get this course certificate all you

627:32

got to do is go in and take that end-of-course

627:34

survey and you'll get it all right

627:36

congratulations on your work so far see

627:37

you in the next

627:42

one all right congratulations again for

627:44

finishing that last project in this

627:47

video and the next video which are the

627:49

last two videos of this entire course

627:52

they're going to be focused on how to

627:54

actually go through and share your

627:56

projects in my recommended way

627:58

specifically we're going to be sharing

628:00

this on GitHub so that way others can

628:02

see it here I am on GitHub and also if

628:05

you didn't notice that's where you

628:06

actually downloaded all those Excel

628:08

files at the beginning of this course

628:10

anyway inside of here is where I'm

628:13

hosting my different projects and you've

628:15

gone through and probably seen this but

628:17

you may not have clicked on something

628:18

like the project One dashboard and in

628:21

this case yeah I have the Excel file but

628:23

that readme in there displays below

628:27

this and this is what we're actually

628:29

going to be doing in the next two videos

628:31

to set this up and then create this readme

628:33

and this allows you to detail all the

628:36

different skills that you used along

628:39

with detailing all the different

628:41

analysis that you did while going

628:43

through this now that was Project one

628:45

project two is going to follow a similar

628:48

method in that it has the Excel file

628:50

and the readme and then in the readme

628:52

itself it details all the different work

628:54

that we did in

628:57

it so you may be like Luke why the heck

629:00

am I going to be using GitHub in order

629:02

to share this project I'm not familiar

629:04

with it I don't know how to use GitHub

629:06

at all why am I going to waste my time

629:07

with it well I think it's useful not

629:10

only in Excel but also other

629:12

Technologies specifically programming

629:14

here I have my SQL project for my SQL

629:16

course and this is where I host my

629:20

SQL code and all the different analysis

629:22

that I did for it and similarly for my

629:25

python course and the project we

629:26

created that I also hosted on GitHub

629:29

and detailed all the different steps

629:30

that we did along with all the different

629:33

Python files associated with it so

629:35

moral of the story is I think GitHub's a

629:37

great tool to use in order to share your

629:39

work not only in Excel but also other

629:41

tools now if you recall from Project one

629:43

we walked through the steps to quickly

629:45

share your project on OneDrive if you

629:48

had it accessible via like a paid

629:50

Microsoft subscription and this provided

629:52

a method to go through and share if you

629:55

go up here and actually copy the link a

629:58

usable link for others whether they have

630:00

Excel or not to actually go in and then

630:03

manipulate your dashboards that you have

630:05

so you may be wondering why the heck are

630:07

we not doing this with this second Excel

630:10

file that we created with all of our

630:11

analysis and then sharing it via this

630:13

method well if you recall back to

630:15

this handy-dandy table of the different

630:17

Microsoft versions and the different

630:19

skills or basically Technologies within

630:21

Excel that it uses Microsoft online

630:24

which is where we hosted that first project

630:26

at doesn't have the capabilities of

630:29

power query or power pivot because of

630:32

that I could go through the process of

630:33

adding the second project to this which

630:36

it's this file right here I'll open it

630:38

up then actually investigating it well

630:41

it does if you investigate all the

630:42

different sheets does go through and

630:44

actually show the analysis that we did

630:48

but if you actually get into

630:49

manipulating it like in this case let's

630:51

say I wanted to see what are the top

630:52

skills of data analyst you're going to

630:54

get this popup right here that says this

630:56

workbook contains external data

630:57

connections or bi features that are not

630:59

supported basically power pivot and

631:01

power query aren't supported it can't

631:03

actually query the data it's just

631:05

showing the basic last snapshot of the

631:07

data right here and you can't manipulate

631:09

it so in this case Microsoft online

631:12

becomes pretty useless so that's why I'm

631:14

recommending sharing it via GitHub as

631:17

you can share all the associated files

631:19

with this if somebody wanted to they could

631:20

come in here and download it along with

631:22

going through and actually detailing

631:24

what you actually did so basically

631:26

controlling the story line and sharing

631:29

what the different analysis or insights

631:31

that you actually found now this what

631:33

you're reading right now is a README

631:35

and it requires understanding markdown

631:38

and how to write in markdown so we're

631:40

going to be covering that more in depth

631:41

in the next video when we get into

631:44

markdown and creating the README this

631:46

video is going to be primarily focused

631:48

on just getting this project into GitHub

631:51

so what are we going to be doing for

631:52

this well we have five major steps we

631:54

need to get through the first thing is

631:55

installing git which is the core

631:57

technology used behind GitHub we'll

632:00

explain more in a bit second and third

632:02

we'll be going through actually setting

632:03

up our GitHub account and then

632:05

installing GitHub desktop to then manage

632:08

with Git our different folders and

632:10

projects and then fourth and fifth we'll

632:13

be basically initializing the repository

632:15

which is a fancy term for a folder and

632:17

from there getting that folder

632:19

repository onto GitHub to then share so

632:23

before we install it what the heck is

632:25

git well similar to how they have track

632:28

changes in stuff like Word and

632:30

PowerPoint git does this it's a Version

632:33

Control System it tracks changes in not

632:35

only files but also code and because of

632:39

all this it allows you also to

632:40

collaborate with others when working on

632:42

a project git is the core technology

632:46

behind managing all these different

632:49

things going on on your own local

632:51

computer and then whenever you make any

632:54

of these changes GitHub is where it

632:56

keeps track of these final changes if

632:58

you will and then displays it for the

633:00

world to see and also pull those changes

633:02

so here's my Excel data analytics course

633:05

right here on GitHub and I have the same

633:07

folders or repository on my own local

633:11

computer now there's actually hidden

633:14

folders or git folders in here managing

633:16

this and I can do a shortcut on Mac of

633:19

command shift period to show that but

633:21

anyway I wanted to mainly show this of

633:23

this .git folder in here and this thing

633:26

I don't necessarily touch this at all or

633:27

work inside of it this .git folder

633:30

contains all the different revisions and

633:33

tracks all the different changes within

633:35

my project so in order to get this .git

633:38

folder inside your project and then also

633:42

get it into GitHub we need to actually

633:43

install git
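
As a rough illustration of the hidden folder described above: a folder is treated as a Git repository once it contains a `.git` folder. The sketch below is not part of the course; the function name `is_git_repo` and the throwaway folder names are assumptions made up for this example.

```python
import tempfile
from pathlib import Path


def is_git_repo(folder: str) -> bool:
    """Return True if `folder` directly contains a hidden .git directory."""
    return (Path(folder) / ".git").is_dir()


# Demonstration on throwaway folders, so no real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    plain = Path(tmp) / "plain_folder"
    repo = Path(tmp) / "repo_folder"
    plain.mkdir()
    (repo / ".git").mkdir(parents=True)  # stand-in for the folder Git creates
    plain_result = is_git_repo(str(plain))
    repo_result = is_git_repo(str(repo))
    print(plain_result, repo_result)
```

Pointing `is_git_repo` at your own project folder before and after creating the repository should show the value flip once the `.git` folder exists.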

633:48

so navigate over to the git website into

633:50

their downloads select your operating

633:52

system choice whether macOS or Windows

633:54

I'm on a Windows machine right here and

633:56

from there I'm going to select the

633:58

64-bit version for Windows and click

634:00

here to download once downloaded I'm going

634:02

to open the file it asks do I want to allow

634:04

this to make changes to my device yes I

634:06

do and then it's going to walk you

634:08

through the setup process for git all of

634:12

these things are going to be left as

634:14

default so feel free to just go through

634:16

and select it all after I've left all

634:18

the default settings as is and selected

634:20

that it then gets into the actual

634:22

install itself looks like it installed

634:24

properly we'll go ahead and click finish

634:26

we can confirm it's installed by opening

634:28

something like terminal and you should

634:30

have a terminal app installed this is

634:32

just confirming it you don't necessarily

634:33

have to do this anyway mine opens in a

634:35

PowerShell and you can just type

634:36

something like git and it shouldn't give

634:39

you an error message it should instead

634:42

give you how you could go about using

634:44

git via the command line in terminal
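
If you would rather not open a terminal at all, the same check can be sketched in Python. This is only an illustration and assumes Python is installed; `find_git` is a made-up helper name, while `shutil.which` is the standard library call that looks for a program on your PATH.

```python
import shutil
from typing import Optional


def find_git() -> Optional[str]:
    """Return the full path of the git executable, or None if it is not on PATH."""
    return shutil.which("git")


git_path = find_git()
if git_path is None:
    print("git was not found on PATH; the install may need another look")
else:
    print(f"git found at {git_path}")
```

A freshly installed Git may only show up after you close and reopen the terminal or editor running the script, since PATH is read when a program starts.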

634:46

don't worry don't be afraid of this we're

634:48

not going to be using git via the

634:49

command line although I may need to make

634:51

a separate course on that instead we're

634:53

going to be using GitHub desktop to

634:55

manage

634:59

git so in order to use GitHub you need

635:01

to have an account if you already have

635:03

an account you can feel free to just

635:04

sign right on in but if you don't go

635:07

through the whole process of entering

635:08

your email providing your different

635:10

credentials and then getting logged in

635:12

once logged in it should direct you to

635:14

your homepage if it doesn't you can come

635:15

up here to this icon at the top and from

635:18

there just select your profile I would

635:20

go through at this point and actually

635:22

customize your profile specifically

635:24

adding a picture your name a little

635:27

description and any social media links

635:30

over here on the right hand side on

635:31

my homepage I have some different pinned

635:34

repositories because you just set it up

635:36

you probably have none but this is where

635:37

we're going to be putting your Excel

635:40

project when you're complete so that way

635:41

if people navigate to your profile they

635:43

can see it now that we have this account

635:45

we need to actually get our project or

635:48

our repository onto GitHub but

635:52

unfortunately there's not really an easy

635:55

method I've found with actually using

635:57

the UI from the website to do this and

636:00

that's mainly because there's a lot of

636:02

technical things going on behind the scenes

636:04

in managing

636:08

git instead I'm going to recommend

636:10

downloading GitHub's application to

636:13

install on your computer they have it

636:15

for both Mac and windows navigate to

636:17

this link here and for this I'm going

636:20

to go ahead and just download the 64-bit

636:23

version of this application this one's a

636:25

lot easier to install than git from here

636:27

once we have it downloaded I'm going to

636:29

open the file the installer should open

636:31

this window for you to next sign into

636:34

GitHub once you've entered your

636:36

credentials for GitHub you'll use this

636:38

to configure git and for this you're

636:41

going to basically say hey I want to use

636:42

my GitHub account name and email

636:44

address to manage all this and click

636:46

finish now it should navigate you to the

636:49

let's get started screen anyway it has

636:51

methods for you to go through and create

636:53

a tutorial repository if you want to

636:55

we're not going to be doing that and it has

636:57

some different options for this that you

636:58

can also select via the file menu such

637:00

as a new repository add local repository

637:03

or clone repository we're going to be

637:05

creating a new repository and as a

637:08

reminder a repository is basically a

637:10

fancy name for a folder but it's a way

637:12

for us to maintain and collect all of

637:15

our different files and whatnot for what

637:18

we're using in our project so for this

637:20

we need to give it a name so I'm going

637:22

to give it something descriptive like Excel

637:23

project data analytics and for

637:26

description I'll just give the simple

637:27

one of my project demonstrating my Excel

637:29

skills for the local path we need to

637:31

actually point it to the folder that has

637:33

this so mine is inside my documents

637:35

folder and real quick inside that folder

637:38

itself right now I would expect you to

637:41

have the project one and project two I'm

637:44

also going to be putting all the

637:45

different files that I have for the

637:48

different Excel workbooks that we work

637:50

through in the lesson if you don't have

637:51

them don't feel like you need it the

637:53

main important thing is that you have

637:54

both project one and project 2 in there

637:56

and I have them conveniently located in

637:59

different folders inside of here never

638:01

getting out of that so I can select this

638:03

Excel project. analytics folder I'm

638:05

going to select this folder it's going

638:07

to ask if I want to initialize this

638:08

repository with a read me I do as far as

638:11

the gitignore I'll put none and license

638:13

none as well and we'll create the

638:14

repository so now you're going to be

638:16

navigated to this screen here which

638:17

is basically the default screen of

638:19

GitHub desktop it allows you to select

638:22

different repositories right now I have

638:24

only the Excel project analytics one it

638:27

allows you to select different branches

638:28

we're going to stay on one shifting to

638:30

another branch is beyond the scope of this

638:31

course then up here at the top it has

638:33

something like publish repository which

638:35

we want to do but one quick thing real

638:37

quick I can actually investigate what

638:40

files are going to be pushed up to

638:43

GitHub by going here into history and

638:47

right now it's just one I selected that

638:49

box for README so the README is in

638:51

there and the other one's just

638:53

.gitattributes the other ones aren't in

638:55

there and I'm doing this on a Windows

638:57

machine well if I navigate back to the

638:59

folder that contains my project so here

639:02

I have Excel project. analytics which I

639:04

selected from GitHub desktop

639:07

whenever I go into it it actually

639:09

created another folder inside of it and

639:14

that has the .gitattributes and README

639:16

that it's talking about now I've

639:18

done this on both Windows and Mac and

639:21

Mac doesn't cause this issue of putting

639:23

another folder inside your other folder

639:26

so for Mac users you may not have this

639:28

problem so completely ignore this but

639:29

for Windows users this is a problem

639:32

because this right here is the project

639:34

or the folder that's going to get uploaded

639:36

to GitHub so what we need to do is take

639:38

all the contents of this by selecting it

639:41

all and just pressing control A to select

639:43

it all and then dragging it into that

639:46

folder so a little confusing but if we

639:48

go back to the documents we have our

639:50

Excel project. analytics folder then

639:52

inside of that we have our GitHub repo

639:54

and then now navigating back into GitHub

639:59

desktop I go over here and I see changes

640:02

we have 85 of 85 different files and

640:04

folders within there it's actually

640:06

picking up on all those different files

640:09

that I have in there once again if

640:11

you're on a Mac you may not see this

640:12

because it's already in there in history

640:15

and you can see it's actually within the

640:17

this portion of the GUI anyway the thing

640:20

now is if we go ahead and publish this

640:23

repository to GitHub it's only going to

640:25

have what's inside of our history right

640:28

now under this what we're calling a

640:30

commit and a commit is a snapshot of

640:34

your

640:35

repository at the time that you're

640:37

basically committing it so we need to do

640:40

a commit in order to get all these

640:42

different changes into a repository cuz

640:45

technically right now they're in an area

640:47

called a staging area or the working

640:49

area anyway we need to provide a summary

640:51

that's required and I'm going to add

640:53

something simple like add all Excel

640:55

files doesn't need to be super

640:56

descriptive and from there I'm going to

640:58

click commit to main now if I go into

641:01

history I have this initial commit that

641:03

it did but then that add all Excel files

641:07

it's going to then have in all those

641:08

different Excel files that I added into

641:13

it so now that our local repository on

641:16

your machine is up to date we need to

641:18

then publish this repository to GitHub

641:21

and we can either click this button or

641:22

this button here for this we're going to

641:23

keep the same name and description that

641:25

we have before we don't want to keep

641:27

this code private so we're going to

641:28

uncheck that box and then from there

641:30

we're going to click publish repository
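
To make the commit idea a bit more concrete, here is a toy model in Python. It is a conceptual sketch only and says nothing about how Git or GitHub Desktop store data internally; the class `ToyRepo`, its methods, and the file names are all hypothetical. It shows the point made above: only what has been committed is available to publish.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyRepo:
    """Toy stand-in for a local repository: a working folder plus a commit history."""
    working_files: Dict[str, str] = field(default_factory=dict)
    history: List[dict] = field(default_factory=list)

    def commit(self, summary: str) -> None:
        # A commit stores a copy of the files exactly as they are right now.
        self.history.append({"summary": summary, "files": dict(self.working_files)})

    def published_files(self) -> Dict[str, str]:
        # Publishing sends the history, so only the latest commit's files are visible.
        return dict(self.history[-1]["files"]) if self.history else {}


repo = ToyRepo()
repo.working_files["README.md"] = "# Excel project"
repo.commit("Initial commit")

# New files appear in the folder but are not committed yet.
repo.working_files["project_1.xlsx"] = "(workbook contents)"
before = sorted(repo.published_files())

repo.commit("add all Excel files")
after = sorted(repo.published_files())
print(before)
print(after)
```

In this toy, `before` holds only the README, while `after` also includes the workbook, which mirrors why the commit has to happen before publishing.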

641:33

so my repository has quite a bit of

641:35

Excel files and the memory size of it is

641:37

pretty large so it is taking a little

641:39

bit of time to do this so now we've

641:42

completed pushing our local repository

641:44

to our remote repository on GitHub so

641:46

inside of GitHub I can navigate up here

641:49

to the right hand side and I go to your

641:52

repositories and here it is the Excel

641:55

project data analytics that we made

641:57

public and it's all in here so now

642:00

somebody can come in here and see our

642:02

different work in this case our project

642:04

One dashboard is inside of here we have

642:06

our Excel file in there and Bam we've

642:09

set up git and also GitHub and that was

642:13

a push so now we need to demonstrate

642:15

what is a pull

642:20

and so in order to demonstrate a pull

642:22

we need to actually make changes

642:24

on our remote repository so that's on

642:27

GitHub and then pull it into our local

642:31

repository so here's what we can do for

642:33

that I'm going to just go in and we

642:35

created this README.md file upon

642:39

creation because we selected that

642:40

checkbox you can actually come in here

642:43

and edit this README by clicking the

642:45

edit file button and I'm just going

642:47

to come in here and I'm just going to

642:48

say hey I added this on github.com

642:52

adding it in the bottom now we're going

642:54

to go into markdown formats and stuff as

642:56

you can see we have this hashtag here

642:58

we're going to go over all that in the next

643:00

lesson but anyway I made these changes to

643:02

here so we need to like we did on our

643:04

local repository and making a change we

643:06

need to commit those changes here and

643:09

conveniently it just gives us a commit

643:11

message of update README confirm the

643:13

correct email and it commits directly to

643:15

the main branch we're just staying on

643:17

that Branch we're not shifting for this

643:18

course at all from there I'm going to

643:20

commit changes so now if I go back into

643:23

the project itself scroll on down to see

643:25

the README I can see that I have I

643:27

added this on GitHub whereas on my local

643:30

machine if I go in to look at the README

643:34

markdown it doesn't have that addition

643:36

that I added to the readme file so we

643:39

need to pull those changes going back to

643:42

the GitHub desktop app I'm going to come

643:44

up here and you notice that it says

643:46

fetch origin this isn't going to do anything

643:48

this is just going to fetch origin

643:49

basically the main branch and Pull It in

643:52

this isn't going to make any changes to

643:53

your file it's just going to update it

643:55

of what's on GitHub and we can see based

643:58

on this that we have basically one

644:01

change here by this one and this down

644:04

Mark and so in order to get these

644:06

changes we need to pull the origin pull

644:08

it and so I'm just going to click it to

644:10

pull and now when we go into the history

644:13

we now have this new one of update

644:15

README we can see that this README has this

644:18

addition because it's in green of I added

644:20

this on github.com and then inspecting

644:22

this in the readme itself it now updated

644:25

to say hey I added this on github.com so

644:27

bam we just demonstrated how to push and

644:30

also pull from our local repository and

644:33

machine to our remote

644:35

repository so now that we have GitHub

644:38

and git all set up we now need to get in

644:40

to actually building out those READMEs

644:43

and explaining what we did in our

644:45

project and demonstrating those skills

644:47

that we gained in this course so that's

644:49

what we'll be doing in the next lesson
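
To recap the difference between fetching and pulling shown in this lesson, here is a small Python sketch. It is a simplified picture, not Git's real behavior: histories are just lists of commit summaries, the local history is assumed to be an unchanged prefix of the remote one, and the function names and commit messages are made up for illustration.

```python
from typing import List


def fetch(local: List[str], remote: List[str]) -> int:
    """Report how many remote commits the local history is missing; changes nothing."""
    return len(remote) - len(local)


def pull(local: List[str], remote: List[str]) -> List[str]:
    """Return the local history with the missing remote commits added on the end."""
    return local + remote[len(local):]


remote_history = ["Initial commit", "add all Excel files", "Update README.md"]
local_history = ["Initial commit", "add all Excel files"]

behind = fetch(local_history, remote_history)    # only looks, like fetch origin
unchanged = list(local_history)                  # local files are still as they were
local_history = pull(local_history, remote_history)
print(behind, local_history == remote_history)
```

Here `behind` plays the role of the number shown next to the down arrow in GitHub Desktop, and only the pull actually brings the README edit onto the local machine.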

644:51

if you're getting stuck at any point

644:53

during the way I highly recommend that

644:54

you make use of something like ChatGPT

644:57

or even Gemini or whatnot and actually

644:59

paste in your error code and it will

645:02

help you with troubleshooting it it's a

645:04

lot quicker than posting a comment in

645:06

here saying that you had an issue all

645:08

right with that see you in the next one

645:09

we're getting into the README see you

645:14

there welcome to the last video in this

645:18

course and in this we're going to be

645:19

going over how we're going to actually

645:22

document all the different work that you

645:24

did for project one and for project two

645:27

we're going to be putting this into our

645:29

markdown file or our README and then

645:32

from there getting it onto GitHub and

645:34

then finally going through how to share

645:36

it on LinkedIn so right now navigating

645:38

to our GitHub repo with our project in

645:41

it you should have at least two folders

645:44

in there one for your project One

645:45

dashboard and one for your project 2 if

645:47

you have your other folders for all the

645:49

work that you did for all the other

645:50

lessons in this course that's awesome

645:53

too but not required mainly just have

645:55

your project work in there anyway we

645:57

have this README for the entire project

646:00

itself and right now it's pretty Bare

646:02

Bones and if we navigate into that

646:04

project One dashboard right now you

646:05

should only have a file in there

646:08

specifically that Excel file but we need

646:10

also a readme in here as well so we can

646:12

add a description of what we

646:14

did in that dashboard similarly project

646:17

2 doesn't have a README as

646:20

well now we have demonstrated in that

646:23

last lesson how we can actually go into

646:25

something like the readme and then from

646:28

there edit it inside of your web browser

646:30

by just clicking this edit this file

646:33

icon it shows not only the edits for you

646:36

to actually go through and maybe type

646:38

something but also the preview itself

646:41

of what the file is going to look

646:42

like don't worry we're going to be going

646:44

over markdown syntax in a little bit but

646:46

anyway that's how we're going to be

646:47

doing all these different changes to the

646:50

files for this I'm not going to do these

646:52

changes I'm actually going to cancel

646:53

these changes now an alternate option to

646:56

making edits to something like a readme

646:58

is using a text editor or IDE integrated

647:02

development environment such as

647:04

something like Visual Studio Code which is

647:06

completely free and I have it

647:08

launched here it's an app

647:10

that I use in order to edit and manage

647:13

my different files I can also go through

647:16

if I'm editing the README itself I can

647:18

type inside of here and edit it but also

647:21

during that I can actually go in and

647:24

view what's going on with the actual

647:27

README itself off to the side while I'm

647:30

typing here in this other window anyway

647:32

I just want to make you aware of this

647:33

that is an option for you to go through

647:36

but it does take some experience with

647:38

knowing how to use VS Code and setting this

647:40

all up so based on the complexity we've

647:43

built up already we're going to

647:44

stick to just editing our READMEs inside

647:47

of github.com

647:50

so before we get into building our

647:53

project READMEs we need to understand

647:55

some syntax here specifically if you

647:58

notice this Excel project analytics is

648:00

capitalized and everything else is

648:02

lowercased and if we actually go in and

648:04

edit the file we can see that we have

648:06

this hashtag at the front which

648:08

translates this into a heading so they

648:11

have special characters that you can

648:13

actually use in front or around text to

648:15

manipulate text

648:17

and the team that created markdown

648:18

conveniently created this cheat sheet

648:20

which I'll link here and it shows all

648:23

the different methods that you can use

648:26

to actually manipulate and make

648:28

different things happen inside your

648:30

markdown file so let's actually look at

648:32

a few here I have a heading one heading

648:34

two and heading three denoted by how

648:36

many hashtags and a space and then if I

648:38

preview this heading one heading two and

648:40

heading three next we can either bold or

648:43

italicize text by surrounding it with either

648:46

double asterisks or single asterisks and the

648:50

final results right here is bold and

648:52

italicized notice how the Bold text and

648:54

italicized are on the same line it's

648:57

important that before you go to a new

648:59

line you actually put two spaces in

649:02

there now that I have that in there it

649:05

will actually shift it to the next line
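
The heading, bold, italic and line break syntax described above can be sketched as a few small Python helpers that build the Markdown text. This is only an illustration; the helper names are made up, and the two spaces for a line break are placed at the end of the first line, which is where CommonMark-style Markdown expects them.

```python
def heading(text: str, level: int = 1) -> str:
    """Heading: `level` hash signs, a space, then the text."""
    return "#" * level + " " + text


def bold(text: str) -> str:
    """Bold: the text wrapped in double asterisks."""
    return f"**{text}**"


def italic(text: str) -> str:
    """Italic: the text wrapped in single asterisks."""
    return f"*{text}*"


def hard_break(line: str) -> str:
    """End a line with two spaces so the following line renders on its own line."""
    return line + "  "


readme = "\n".join([
    heading("Heading 1", 1),
    heading("Heading 2", 2),
    heading("Heading 3", 3),
    hard_break(bold("bold text")),
    italic("italicized text"),
])
print(readme)
```

Pasting the printed text into the README editor on github.com and switching to preview is a quick way to confirm each piece renders the way you expect.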

649:07

we can also do things like an ordered

649:09

list or an unordered list which would be

649:11

like bullet points and it conveniently

649:13

indents that and makes it look a lot

649:15

nicer we can also surround something by a

649:17

back tick which is located up at the top

649:20

of your keyboard or you could do triple

649:22

back ticks at the top and bottom for if

649:25

you have multiple lines of code and if

649:27

we actually go to preview this we can

649:29

see that the single line of code was

649:31

just surrounded whereas a multiline

649:33

creates this entire coding block the

649:35

final two worth mentioning are links and

649:38

also images for the link for the text

649:41

that you wanted to appear for the link

649:42

you'll put in square brackets and then

649:44

for the hyperlink itself you're going to

649:46

put that inside parentheses right next

649:48

to it and then actually changing this to

649:50

a real world example of something like

649:52

google.com if I go to preview and then I

649:55

click this link it's going to ask me if

649:57

I want to leave site and go to Google

649:59

I'm not going to do it because it's

650:00

going to mess up all my changes but you

650:02

get the point for images it's very similar

650:05

but the text you provide in the square

650:06

brackets is just your alternate text so

650:08

whenever you hover over it that's the

650:09

text that displays and then from there is

650:11

the actual image location however this

650:14

isn't an actual image location so I have

650:17

this error message that goes on with this

650:19

alt text hence this broken file you're

650:22

going to notice that if any of your

650:23

files for your images are broken anyway

650:26

github.com actually makes it pretty easy

650:28

to get images in in this case I have a

650:30

gif of the dashboard you could also use

650:32

an image file but all I have to do is

650:34

take it and drag it into here and if you

650:36

notice it automatically formatted it

650:38

with alt text and then the actual link

650:40

location itself so saving the file

650:42

itself and it puts that exclamation

650:44

point at the front signifying that it's

650:46

an image or in this case GIF if I go to

650:48

preview scrolling down we can see that

650:51

we have our image once again you need to

650:54

put spaces after that other one to make

650:56

sure that you're not having it all in

650:57

the same line but you get the

651:00

point anyway let's actually get into

651:03

creating this README that's on the

651:05

homepage if you will of our actual

651:08

project and the main point of this one

651:09

is I want people to be navigated to the

651:12

appropriate project depending on what

651:14

they're looking for so I went ahead and

651:15

put in some text already for how I want

651:17

to break this down I'll

651:19

shift over to preview and I'm going to have

651:22

a title such as my excel. analytics

651:24

projects from there we're going to have

651:26

the salary dashboard project and the

651:27

salary analysis right now the image that

651:30

I have for the dashboard is in the wrong

651:33

location actually shift that up now I

651:35

went ahead and added the images also for

651:37

our salary analysis while cleaning up

651:39

where the salary dashboard is which I

651:41

included only just two graphs here but I

651:43

just want to give a sneak peek of what's

651:45

going to be inside of those other READMEs

651:47

that we're about to build out now you may

651:49

be wondering how the heck do I get

651:51

screenshots of graphs in my different

651:54

dashboard well depending on if you're

651:55

using Mac or Windows they have software

651:58

installed already and so these shortcuts

652:01

should work for you in order to perform

652:03

your appropriate screen capture I

652:06

primarily use on a Mac command Shift 4

652:08

to select a certain area and it allows

652:11

me to basically just hover over

652:13

something and snapshot it this same

652:15

thing can be done on a Windows

652:16

machine you're just going to press

652:18

Windows plus shift plus S so I went through

652:21

also and just added a quick description

652:23

to each section I'll go into preview

652:24

because it's a little bit easier to read

652:25

there anyway underneath this I just

652:27

detail hey this contains all my Excel

652:30

files to follow along in my case my free

652:32

course of Excel for data analytics I

652:34

would word it differently for you of

652:36

that you're actually providing all your

652:38

different Project work in this

652:39

repository additionally I provide a

652:41

short description for the first project

652:44

and then also a short description for

652:45

the second project make sure in this

652:47

case you actually are putting spaces

652:50

after those lines so you don't have

652:52

those images overlay on top of it now

652:54

the last thing I would do as you see

652:56

here I link to my course but I think

652:58

more importantly what you need to do is

653:00

actually link to the appropriate files

653:03

within this repository so people can

653:05

quickly get to the salary dashboard or

653:07

the salary analysis and so I'm going to

653:09

add this link of connecting to that

653:12

appropriate project by first adding this

653:14

text of check out my work here

653:17

and then inside parentheses I'm going to

653:19

list the folder of project One dashboard

653:24

you have to make sure you spell it

653:25

exactly like the folder that is inside

653:28

of your repository or the Link's not

653:30

going to work I'm going to do the same

653:31

with the project two dashboard as well
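
Since a link only works when the text in parentheses matches the folder name exactly, a quick check like the Python sketch below can catch a mismatch before committing. It is an illustration under stated assumptions: the function name, the simple regular expression, and the folder names are hypothetical, link targets are assumed to contain no spaces, and web links and images are skipped.

```python
import re
import tempfile
from pathlib import Path
from typing import List

# Matches [link text](target) but skips images, which start with "!".
LINK_PATTERN = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(readme_text: str, repo_root: Path) -> List[str]:
    """Return link targets in the README that do not exist inside repo_root."""
    broken = []
    for target in LINK_PATTERN.findall(readme_text):
        if target.startswith(("http://", "https://", "#")):
            continue  # only local paths are checked here
        if not (repo_root / target).exists():
            broken.append(target)
    return broken


# Demonstration on a throwaway folder that mimics a small repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "1_Project1_Dashboard").mkdir()
    (root / "2_Project2_Analysis").mkdir()
    text = (
        "[Check out my work here](1_Project1_Dashboard)\n"
        "[Check out my work here](2_Project2_Dashboard)\n"
    )
    problems = broken_relative_links(text, root)
    print(problems)
```

One caveat: Windows and macOS usually ignore upper and lower case in folder names while links on github.com do not, so a link can pass this local check and still fail online if only the capitalization differs.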

653:33

and going to preview it I can see it's

653:36

all there I probably want some spaces in

653:38

between

653:40

this and so just put an extra enter in

653:42

there okay that's good enough I'm going

653:44

to get into committing the changes this

653:46

is update my readme that sounds good I'm

653:49

going to commit them so now on our home

653:52

folder of our repository of excel

653:54

project. analytics scrolling down I have

653:56

my README here it tells me about it and

653:58

then for the salary dashboard it says

653:59

hey check out my work here when I click

654:01

on it it navigates me into this folder

654:04

for the salary dashboard which you need

654:05

to now create a README for also it's just

654:08

good practice to make sure that you

654:09

check to make sure that other link works

654:11

as well and in this case it didn't it's

654:13

a good thing we checked it I had project

654:15

2 dashboard and instead it was actually

654:17

project 2 analysis I'm going to commit

654:20

changes and then now when I actually try

654:22

it out bam navigates me to the right

654:24

location so you now have the basics

654:26

to go through you understand markdown

654:28

enough to edit it I'm going to walk

654:30

through how I built out the project one

654:33

README and also the project 2 README

654:35

so that way you have some understanding

654:37

of what you should do going forward with

654:39

the project one I recommend including a

654:41

picture of the dashboard to start and

654:43

then a brief intro detailing why you

654:45

wanted to do this project underneath

654:47

this make sure you include a link to the

654:49

file itself which is conveniently right

654:51

here and then inside of here detailing

654:54

the different skills that you used with

654:56

building this is really important for

654:58

job Seekers that way if a recruiter

655:01

comes and looks at this they see what

655:02

the skills are that you used in this and then

655:04

from there I talk about the data set

655:06

itself talking about what we were trying

655:08

to get or extract out of the data so

655:11

basically all the foundation they need

655:12

in the introduction portion this portion

655:15

I recommend keeping in a similar

655:17

format the next portion you can feel

655:19

free to go about however you want

655:21

specifically I go into the dashboard

655:23

build breaking it down into three main

655:25

areas of focus first is the charts

655:29

itself I highlight the different median

655:31

salaries of all the different job titles

655:33

themselves I go into some insights from

655:35

that I also talk about the country map

655:38

and the insights from this as well next

655:40

after charts I move into functions and

655:42

formulas detailing one of the key

655:44

functions that we used using median and

655:46

then an if statement in order to build

655:48

out an array formula so not only

655:50

breaking it down but also explaining

655:52

what insights we're able to get with

655:54

this formula and then the third skill I

655:56

talk about is data validation talking

655:59

about why it's used with a gif of how it's

656:02

actually applicable or how it's actually

656:04

visually seen in Excel and then finally

656:06

I just wrap it up with a conclusion so

656:09

to recap for the first project you need

656:11

an intro statement describing what we're

656:13

doing and why you did it and what skills

656:15

you used then from there on the

656:16

build itself explaining what you

656:19

actually built how you use those skills

656:21

and what insights you got out of it and

656:23

then finally wrap it up with a

656:24

conclusion for the second project mine

656:27

is very similarly formatted in that I have

656:29

an introduction Excel skills used the

656:32

data set and then since this one was

656:34

primarily focused on analysis I included

656:37

the four questions that we went through

656:39

and actually answered for our analysis

656:42

so then with the template of these four

656:44

questions I broke each one of those down

656:47

with those questions primarily focusing

656:49

on one what skill did I use to help

656:52

answer that question and then two what

656:56

is the analysis insights I got out of

656:59

answering that question I repeat the

657:01

same thing for the second question

657:02

specifying the skills that we use for

657:04

this and then the analysis or what

657:06

insights we got out of it after going

657:08

through questions three and four we then

657:10

get to our final thing of a conclusion

657:12

of what you actually learned and extracted

657:14

from insights for this so it's really

657:16

good to put all this stuff in it I

657:18

wouldn't be overwhelmed and think you

657:20

need to include everything in it think

657:22

about a job recruiter themselves they

657:23

don't have a lot of time so keeping it

657:26

as short and to the point as possible is

657:28

going to be best for

657:31

you once you've actually gone

657:34

through and built out your repo with all

657:36

its associated README it's time to get

657:37

into actually sharing this on social

657:40

media via LinkedIn I recommend the same

657:42

approach that we used back in project

657:44

one of listing this down in your project

657:47

section by going through and actually

657:49

clicking the add icon and adding the

657:51

projects if you did go through and

657:53

actually add that salary dashboard

657:54

already I would just focus this one on

657:57

the salary analysis so I'd put in

657:59

something like a name of the data

658:01

science job analysis a description add

658:03

any appropriate skills there's a ton of

658:05

different skills you can actually select for

658:07

what you use I would focus on primarily

658:09

these of Microsoft Excel Power Query

658:12

data modeling ETL and pivot tables for

658:15

the media in this case I would include a

658:18

link to your repo and paste it on into

658:21

here and click add it will then provide

658:24

this snapshot thumbnail of what's going

658:25

on here and a title I like it all I'll

658:28

click apply now if you recall back from

658:30

that first project we tried to provide

658:32

the link of that OneDrive link for

658:34

Excel and it didn't work so if you have

658:36

that project on LinkedIn I would go

658:38

through and also attach this link as

658:40

well to that so that way they know how

658:42

to navigate to it finally select your

658:44

start and stop date if you have any

658:46

contributors associated with it I don't

658:47

have any in this case and then from there

658:48

save it the last thing I recommend doing

658:51

is making a post telling others about

658:53

your project so they can come in and see

658:54

it in it I would definitely include

658:56

something like a link and feel free to

658:58

tag Kelly or myself in it I love

659:00

checking out your projects and seeing

659:01

the different work that you've done for

659:03

it so once again congratulations for

659:06

finishing this course it's been nothing short

659:08

of your hard work Excel was the first

659:10

skill or main skill that I learned in

659:12

helping me land my first data analytics

659:15

opportunity so I feel the same can go

659:17

for you as well now after you've taken a

659:19

short break and you're ready to get back

659:20

into learning more skills I do have a

659:23

SQL course that I recommend you take

659:26

as you've learned from analyzing this

659:27

data Excel and SQL are two of the

659:30

top skills of data analysts so it pays to

659:33

know it and you can basically learn it

659:35

in a weekend all right with that I'll

659:37

see you then either in the next video or

659:39

in the next course see you there
