Communicating

CMPT 353

Communicating

Clever data analysis isn't useful if…

  1. it doesn't answer useful questions, or
  2. nobody understands the results.

Communicating about both requirements and results is a big part of the data science job.

Communicating

From some data science job ads, requirements about asking the right questions:

"Energy domain expertise is highly desired; a passion for the energy domain is essential." (EnerNOC)
"Liaising with credit risk strategy managers, leaders, and other stakeholders to identify requirements by capturing the distinct problems and the expected outcomes which impact critical business processes and/or decisions." (BMO Financial Group)

Communicating

Requirements about asking the right questions:

"… applying advanced analytics to tackle complex and non-routine business problems to drive to actionable business insights." (REHUMAN Inc)
"Gather and refine specifications and requirements based on business needs." (MasterCard)

Communicating

Requirements about communicating results:

"Engage with stakeholders to ensure that data insights are effectively communicated through the most appropriate data visualization and navigation tools." (REHUMAN Inc)
"We're looking for people who are constantly trying to improve not only their technical skills but their communication and interpersonal skills as well." (Best Buy Canada)

Communicating

Requirements about communicating results:

• Provide timely, relevant, coherent results (reports, data analyses, etc.) designed to meet the client's specific needs, and tailored to specific audiences;
• Transfer technology and knowledge through reports, handbooks, workshops, and presentations to members, clients and general conferences.
(FPInnovations)

Asking the Right Question

It's hard to say much here.

Real data science questions are going to be about whatever field or industry they come from. You may be asked about finance, or marketing campaigns, or customer behaviour, or forestry site productivity, or …

Asking the Right Question

You have to be able to communicate with people who understand the problem at hand, and make sure you know what is needed. Ask questions as necessary.

Nobody is going to expect the co-op student or new hire to be a domain expert.

Remember: the goal is to get the information that is needed. That may be only loosely related to what was requested.

Asking the Right Question

My experience: the question people ask always sounds perfectly reasonable.

The question they meant is sometimes trivial, sometimes reasonable, or sometimes impossible.

It's best to find out which early.

Communicating Results

Communicating data science results is a lot like communicating in general.

Hopefully your W courses (CMPT 376 or similar) point you in the right direction.

Communicating Results

When explaining your results, make sure you are clear and honest about what you found.

Resist the urge to make your results sound cooler than they actually are. If the results aren't very definitive, then say so.

Communicating Results

Also, don't be afraid to acknowledge the limitations of your analysis.

If there isn't enough data, or the right data, or a technique to find the answer you're seeking, then you should be able to explain that clearly.

Communicating Results

Being honest might include technical details: assumptions about data, \(p\)-values, possible artifacts of the method.

You should probably address those (depending on the context and audience). Do your best to explain them in a way your audience can understand.
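As a rough illustration of reporting a result together with its caveats, here is a minimal sketch. The data is randomly generated for the example, and the function name `describe_comparison`, the choice of Welch's t-test, and the 0.05 threshold are all assumptions made here, not something prescribed by these slides.

```python
import numpy as np
from scipy import stats

# Hypothetical samples, generated only for illustration.
rng = np.random.default_rng(353)
group_a = rng.normal(loc=10.0, scale=2.0, size=40)
group_b = rng.normal(loc=10.8, scale=2.0, size=40)


def describe_comparison(a, b, alpha=0.05):
    """Return a plain-language summary of a two-sample comparison."""
    # The t-test assumes roughly normal data, so check that assumption first.
    normal_p = min(stats.normaltest(a).pvalue, stats.normaltest(b).pvalue)
    # Welch's t-test: does not assume the two groups have equal variances.
    ttest_p = stats.ttest_ind(a, b, equal_var=False).pvalue

    lines = [f"Means: {np.mean(a):.2f} vs {np.mean(b):.2f} (n = {len(a)}, {len(b)})."]
    if normal_p < alpha:
        lines.append(
            "Caution: at least one group does not look normally distributed, "
            "so the t-test may not be appropriate."
        )
    if ttest_p < alpha:
        lines.append(
            f"The difference in means is statistically significant (p = {ttest_p:.3g})."
        )
    else:
        lines.append(
            f"We could not conclude that the means differ (p = {ttest_p:.3g}); "
            "that is not evidence that they are equal."
        )
    return "\n".join(lines)


summary = describe_comparison(group_a, group_b)
print(summary)
```

The point of the sketch is the wording of the output: it states the assumption that was checked, reports the p-value, and avoids claiming more than the test supports.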

Visualizing Data

Of course, visualizations (charts, graphs, etc.) are a frequently useful way to present data.

We have used matplotlib several times in the course. Maybe also look at Seaborn. For more, have a look at the Visualization with Matplotlib chapter in the Python Data Science Handbook.

Visualizing Data

When creating visualizations, make sure you display your data so that the reality of the data is easy to see. The goal should be to help readers understand what is happening.

Don't mislead with visual junk.

Visualizing Data

Choose a visualization that makes the interesting differences clear.

Visualizing Data

Make sure you label what's going on and make sure formatting of the data makes it readable.
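A minimal sketch of what that labelling might look like in matplotlib. The rainfall numbers are made up for illustration, and the function name `labelled_rainfall_plot` is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np


def labelled_rainfall_plot():
    """Build a small line plot with a title, axis labels, units, and a legend."""
    # Hypothetical data, made up purely for illustration.
    months = np.arange(1, 13)
    rainfall_mm = np.array([168, 105, 114, 89, 65, 54, 36, 37, 51, 121, 189, 162])

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(months, rainfall_mm, marker="o", label="Monthly rainfall")
    ax.set_title("Monthly rainfall (illustrative data)")
    ax.set_xlabel("Month")
    ax.set_ylabel("Rainfall (mm)")  # include units in the axis label
    ax.set_xticks(months)           # one tick per month, not 2, 4, 6, ...
    ax.legend()
    fig.tight_layout()              # keep labels from being clipped
    return fig, ax


fig, ax = labelled_rainfall_plot()
```

Calling `fig.savefig("rainfall.png")` afterwards would write the figure to a file.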

Visualizing Data

The same plot after:

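import seaborn
# Apply seaborn's default theme to all later matplotlib plots
# (in seaborn 0.11 and later, set() is an alias for set_theme()).
seaborn.set()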
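For context, a self-contained sketch of how those two lines fit into a plotting script. The sine curve is a stand-in chosen for this example; it is not the plot originally shown on the slide.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import seaborn

# Switch matplotlib's defaults to seaborn's theme. This changes global
# settings, so it affects every plot created afterwards in this process.
seaborn.set()

# Stand-in data for illustration only.
x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()
```

The plotting calls are ordinary matplotlib; only the styling defaults change, which is why the theme has to be set before the figure is created.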

More Resources