Tuesday 23 October 2018

Simulating how the 2018 Allsvenskan season will end

Monday 22 October 2018

How SQL++ Makes JSON More Queryable

Datanami:
For more than 40 years, SQL has provided a standard way to query structured data. However, much of the data being generated and stored today exists in semi-structured formats, like JSON, which doesn’t “speak SQL.” But now, thanks to an extension of the SQL language called SQL++, developers who are fluent in SQL have an easier path to incorporating semi-structured data into their queries and analytics.
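
To make the gap concrete, here is a minimal Python sketch (not SQL++ itself) of why nested JSON does not map directly onto flat SQL rows. The sample document and the `unnest_orders` helper are hypothetical, invented for illustration; the helper only mimics the kind of unnesting that a query over an array field has to perform.

```python
import json

# Hypothetical sample document: one customer with a nested array of orders.
doc = json.loads("""
{"name": "Ada", "orders": [{"item": "book", "qty": 2}, {"item": "pen", "qty": 5}]}
""")

def unnest_orders(customer):
    """Flatten one customer document into one flat row per nested order."""
    return [
        {"name": customer["name"], "item": order["item"], "qty": order["qty"]}
        for order in customer.get("orders", [])
    ]

rows = unnest_orders(doc)
for row in rows:
    print(row)
```

Each nested order becomes its own flat row, which is the shape a plain SQL table expects.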

On P4 again, about statistics and our miscalibrated brains

Olle Häggström
Häggström hävdar:
Today, 17 October, I was back with Stefan Livh on P4 Göteborg again. This time the discussion was about our miscalibrated brains, the need for rigorous statistical methods in science, the distance between Mölnlycke and Åmål, and, briefly at the end, the criticism I have directed here on the blog at Göran Lambertz regarding his home-made probability calculation. Our conversation starts about 1:08:53 into the broadcast and runs (including various breaks for music and news) until about 1:42:15.

World-class statistical sorcery – Ekonomistas

Ekonomistas
It is sometimes said that anything can be proved with statistics. The saying probably refers mainly to how easy it is to deceive with statistics. But the fact is that statistical analysis always rests, to varying degrees, on assumptions. In many cases we have good reasons to make these assumptions, while other cases are more controversial and can give rise to long scientific debates, for example about non-response, measurement error, and which variables can be disregarded in the analysis. An article forthcoming in the Journal of Political Economy shows, however, that even the most trivial comparison rests on assumptions that are not self-evident.

Sunday 21 October 2018

Paul Biemer advertises his #BigSurv18 talk

Saturday 20 October 2018

The problem child of the Swedish register databases: the documentation

Ekonomistas
Swedish register data are increasingly used in research and evaluation, which has produced a wealth of findings and lessons, particularly in medicine and the social sciences. But one rarely discussed problem is that most of these register databases are poorly documented. The number of variables is very large (thousands), and in most cases they have changed over time. The statistical agencies in Denmark (DST) and, to some extent, Norway (SSB) have set up coordinated systems for register documentation, but unfortunately Sweden and SCB have fallen behind.

Also read the article in Qvintensen!

Data science for official statistics: the story so far


RoyalStatSoc
With the data revolution occurring all around us, official statistics have to work harder to measure what is going on in society and the economy. We have to ensure that decisions are taken with the right evidence at the right time. Whilst there is great potential with more data, statistics producers have to be increasingly efficient and do more with less. ONS is striving to lead the way and be at the forefront of this revolution.

Thursday 18 October 2018

Wednesday 17 October 2018

Do more with R: drag-and-drop ggplot

InfoWorld


A new R package creates a simple graphical user interface for ggplot2—and it generates R code for the visualization you create.

Some R users become leery of graphical user interfaces. Pointing and clicking and dragging may be convenient, but it can be harder to save, check, or rerun an analysis.

But I think even most hardcore command-line junkies would agree that a drag-and-drop interface can be helpful for some exploratory data visualization.

Tuesday 16 October 2018

GEOSTAT 2 - A point-based foundation for statistics

GEOSTAT 2 | EFGS

The aim of GEOSTAT 2 (2015-2017) was to foster better integration of statistics and geospatial information, so that the statistical community can provide more qualified descriptions and analyses of society, the economy and the environment. GEOSTAT 2 was a two-year ESSnet grant project building on the results of its predecessors, GEOSTAT 1A and 1B.

Friday 12 October 2018

Statistics Canada promises more detailed portrait of Canadians with fewer surveys

The Globe and Mail
Canadians are increasingly shunning phone surveys, but they could still be providing Statistics Canada with valuable data each time they flush the toilet or flash their debit card.

The national statistics agency laid out an ambitious plan Thursday to overhaul the way it collects and reports on issues ranging from cannabis and opioid use to market-moving information on unemployment and economic growth.

Thursday 11 October 2018

How do opinion polls affect voter behaviour? A reflection on the 2018 election with a focus on the parties around the threshold

Qvintensen web magazine - by Annika Fredén
In this year's election, KD landed at 6.3 percent and MP at 4.4 percent. How could it go so well for KD, and what was it that put the margins on MP's side as well?

In the 2018 election, the opinion polls appear to have helped coordinate voters' choice of party. The polls were relatively close to the final election result, and the share of “wasted” votes was unusually low: most people voted for a party that entered the Riksdag. In this text I take a closer look at the two parties closest to the threshold, Kristdemokraterna and Miljöpartiet, and use my earlier research to explain their balancing act around the threshold.

Visualize and communicate your data analysis more effectively

jmp - Seminars:

Join us online on 18 October!

We all use graphs to communicate data. But poorly designed, misleading graphs can do more harm than good. Data visualization – when done right – can transform not only the way you present results to colleagues and stakeholders in your organization, but also the way you explore and analyze the data itself. How can you be sure your visualizations add real value – not just noise?

In this seminar, data visualization guru Kaiser Fung will demonstrate how to:

Make effective visual displays.
Find a story in a raw data set.
Turn chart-making from a job requirement to a professional advantage.

Great R packages for data import, wrangling and visualization


See how the tidyr R package’s gather and spread functions work. Plus a bonus look at labeling in ggplot2

Here are my go-to R packages -- in a handy searchable table.

One of the great things about R is the thousands of packages users have written to solve specific problems in various disciplines -- analyzing everything from weather or financial data to the human genome -- not to mention analyzing computer security-breach data.

Some tasks are common to almost all users, though, regardless of subject area: data import, data wrangling and data visualization. The table below shows my favorite go-to packages for one of these three tasks (plus a few miscellaneous ones tossed in). The package names in the table are clickable if you want more information. To find out more about a package once you've installed it, type help(package = "packagename") in your R console (of course substituting the actual package name).
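
As a rough illustration of the wide-to-long and long-to-wide reshaping that tidyr's gather and spread functions perform, the sketch below re-implements the idea in plain Python on a list of dicts. It is not the tidyr API: the `gather` and `spread` helpers, their parameters and the sample data are hypothetical stand-ins chosen for this example.

```python
def gather(rows, id_col, key_name, value_name):
    """Wide to long: one output row per (id, non-id column) pair."""
    return [
        {id_col: row[id_col], key_name: col, value_name: val}
        for row in rows
        for col, val in row.items()
        if col != id_col
    ]

def spread(rows, id_col, key_name, value_name):
    """Long to wide: collect key/value pairs back into one row per id."""
    wide = {}
    for row in rows:
        target = wide.setdefault(row[id_col], {id_col: row[id_col]})
        target[row[key_name]] = row[value_name]
    return list(wide.values())

# Hypothetical sample data: one row per city, one column per year.
wide_rows = [
    {"city": "Oslo", "2017": 10, "2018": 12},
    {"city": "Lund", "2017": 7, "2018": 9},
]
long_rows = gather(wide_rows, "city", "year", "value")
round_trip = spread(long_rows, "city", "year", "value")
```

Gathering turns the two year columns into four (city, year, value) rows, and spreading those rows again recovers the original wide layout.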

Wednesday 10 October 2018

Nigeria eyes 10% share of $3.3bn global data analytics market

Businessamlive
Yemi Kale, statistician-general of the federation and CEO of the National Bureau of Statistics (NBS), in a paper titled “Big data economy: Driving the economy through data science”, noted the huge data demand in the country, which is being fuelled by growing insistence on accountability and good governance by citizens, as well as the desire by governments at all levels to demonstrate progress and democratic dividends in various sectors.

Federal government urged to reintroduce statistics in secondary schools

TODAY.NG
The Nigerian Statistical Association (NSA) has urged the Federal Government to reintroduce statistics as a core subject in the secondary school curriculum.

The association, in its communiqué at the end of the 42nd Annual Conference in Awka, Anambra State, also recommended that government should create an enabling environment to encourage private sector participation, to open up the oil and gas sector and expand investment opportunities across the entire value chain.

NSA also urged state governments yet to enact their statistics laws to do so while enjoining other states to implement the Statistical Master Plan to the letter.

'If none of our ideas fail, we are not being ambitious enough' - inside government's data science revolution

PublicTechnology.net
Tom Smith, managing director of the ONS Data Science Campus, tells PublicTechnology how and why he wants to 'move the needle' for the use of data across the public sector

On the website of the Office for National Statistics’ Data Science Campus, the campus’s managing director Tom Smith is described as a “lifelong data addict”.

“Tom has more than 20 years’ experience using data,” it adds.

Economics Nobel laureate Paul Romer is a Python programming convert

Quartz:
Instead of using Mathematica, Romer discovered that he could use a Jupyter notebook for sharing his research. Jupyter notebooks are web applications that allow programmers and researchers to share documents that include code, charts, equations, and data. Jupyter notebooks allow for code written in dozens of programming languages. For his research, Romer used Python—the most popular language for data science and statistics.

Statistics and data science degrees: Overhyped or the real deal?

The Conversation:

“Data science” is hot right now. The number of undergraduate degrees in statistics has tripled in the past decade, and as a statistics professor, I can tell you that it isn’t because freshmen love statistics.

Way back in 2009, economist Hal Varian of Google dubbed statistician the “next sexy job.” Since then, statistician, data scientist and actuary have topped various “best jobs” lists. Not to mention the enthusiastic press coverage of industry applications: Machine learning! Big data! AI! Deep learning!

Monday 8 October 2018

The Difference Between Computer Science and Data Science

Many students are unsure whether data science is part of computer science. In fact, data science belongs to computer science yet remains distinct from it. The two terms are similar, but there is a significant difference between them. Computer science has various subdomains, such as artificial intelligence, analytics, programming, natural language processing, machine learning, web development and many more. Data science is also part of computer science, but it requires considerably more knowledge of maths and statistics.

PhD student in Statistics and Machine Learning within WASP Graduate School Linköping

Vacancies - Linköping University
Skilled and committed employees are a crucial factor in the success of Linköping University. And we need more of them. Our core expertise comes from teachers and researchers, but a successful university requires experienced and motivated employees in many fields. Everyone is important. We need to recruit many new employees due to retirement among our current staff and an expansion in our research activity. We need you here. We look forward to receiving your application!

Sunday 7 October 2018

Agenda - Tonight 21.15 | SVT Play

30 minutes into Agenda: One year after his death, Hans Rosling is bigger than ever. His last book, Factfulness, is praised by Bill Gates and is to be handed out to Sweden's upper secondary school students. But does it give a one-sidedly optimistic view of the world's problems? A discussion between Professor Christian Berggren and Ola Rosling, co-author of the book. Presenter: Camilla Kvartoft.

https://www.svtplay.se/video/19236143/agenda/agenda-7-okt-21-15-1?start=auto

You Need Statistics to Make Wine

Stats With Cats Blog

The American Statistical Association has identified 146 college majors that require statistics to complete a degree.

You probably wouldn’t be surprised that statistics is required for degrees in mathematics, engineering, physics, astronomy, chemistry, meteorology, and even biology and geology.

New honorary doctors lectured on the importance of a true worldview

Högskolan i Skövde


https://youtu.be/tuBQsyelamM

Saturday 6 October 2018

Significance Magazine

“It should never be true, though it is still often said that the conclusions are no more accurate than the data on which they are based”

Error Statistics Philosophy
My new book, “Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars,” you might have discovered, includes Souvenirs throughout (A-Z). But there are some highlights within sections that might be missed in the excerpts I’m posting. One such “keepsake” is a quote from Fisher at the very end of Section 2.1.

Friday 5 October 2018

Methodology statisticians wanted at SCB

Statistiska centralbyrån:
The Department for Process and Method Development is seeking methodology statisticians, to be based mainly in Stockholm. The department is responsible for ensuring that the right expertise in methods, quality, documentation and measurement techniques exists and is developed in support of statistics production, and for driving the development and improvement of our production processes.

Statistical Inference for Analysis of Massive Health Data: Challenges and Opportunities


https://youtu.be/cFmthGjjbKc

Speaker: Professor Xihong Lin - Chair, Department of Biostatistics, Harvard T.H. Chan School of Public Health

Massive data from genome, exposome, and phenome are becoming available at an increasing rate with no apparent end in sight. Examples include Whole Genome Sequencing data, large-scale remote-sensing satellite air pollution data, digital phenotyping, and Electronic Medical Records. The emerging field of Health Data Science presents statisticians with many exciting research and training opportunities and challenges. Success in health data science requires strong statistical inference integrated with computer science, information science and domain science. Examples include signal detection, network analysis, integrative analysis of different types and sources of data, and incorporation of domain knowledge in health data science method development. In this talk, I discuss some of these challenges and opportunities, and illustrate them using high-dimensional testing of dense and sparse signals for whole genome sequencing analysis, integrative analysis of different types and sources of data using causal mediation analysis, and analysis of multiple phenotypes (pleiotropy) using biobanks and Electronic Medical Records (EMRs).

Highest Booming Data Science Platform Market growing at a CAGR of 35.8% by 2023

openPR
In business functions, the logistics segment holds the largest market share and is gaining significant importance among corporates & enterprises. In the logistics industry, customer satisfaction, global expansion, a strong delivery & transport network, and a wide global/local presence are the most essential factors. Data scientists apply advanced mathematics and statistics to address numerous business queries and deliver insights to management, thereby maximizing the return on assets and Return on Investment (RoI).
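
For readers unfamiliar with the term in the headline, a compound annual growth rate (CAGR) is the constant yearly growth rate that links a start value to an end value. The sketch below shows the arithmetic in Python. The 35.8% rate comes from the headline; the starting value of 100 and the five-year horizon are arbitrary assumptions chosen only for illustration, and the function names are hypothetical.

```python
def project(start_value, rate, years):
    """Value after compounding at a constant annual rate."""
    return start_value * (1 + rate) ** years

def cagr(start_value, end_value, years):
    """Constant annual growth rate linking start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1

# 35.8% is the rate from the headline; 100 and 5 years are assumed inputs.
end_value = project(100.0, 0.358, 5)
recovered_rate = cagr(100.0, end_value, 5)
print(round(end_value, 1), round(recovered_rate, 3))
```

Under these assumed inputs the value more than quadruples over five years, which shows how quickly a rate of this size compounds.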