Blog

In our blog posts you will find practical know-how, tips and tricks, and our experts' points of view.

Development Of A Powerful Data Science Team

Data science has become increasingly professionalized and standardized in recent years. The often intrinsically motivated data tinkerer, who fills the "analysis" niche in the business with deep in-house data and process know-how, is reaching his limits.

Increasing demands, especially as a result of a stronger customer focus across all industries, are forcing businesses to professionalize their data science structures. This includes knowledge, the available data sources and their preparation, and the data science products already in use in the business.

Read more
From SAS to R and back: Transferring SAS data Into an R System

SAS and R are closely related: both are popular tools for people like us who want to solve statistics and machine learning problems on (more or less) large data volumes. Despite this apparent proximity, there are few touchpoints between the communities, and only a few people work with both tools. As passionate "outside the box" thinkers, we regret this, and with this article we are starting a mini-series that covers, in loose order, topics connecting the two worlds. This first article deals with the options for exchanging data between the systems. As there are many ways to do this, the article is limited to the transfer from SAS to R; the opposite direction will follow in a later article.

Read more
Best Practice: Campaign Implementation

To implement a campaign successfully, it is important to be able to rely on a closed campaign planning process within the business. Without such a defined and uniform process, valuable potential is wasted and long-term success is at risk.

Read more
Boosting For The Naive Bayes Classifier

There are many areas in which neuroscience and machine learning overlap. One of them is combining the results of several learning episodes, each with only modest success, into a merged, stronger model for a particular task. In machine learning, this process is referred to as "boosting". Developing solutions of this kind is a very interesting topic, particularly in the IT industry; the article therefore provides a short introduction to machine learning that presents the basic ideas and the application of the naïve Bayes classifier in R.

Read more
A Basket Full of Snakes: Python Modules for Data Science

Anyone who has read my earlier blog posts knows that I am a big fan of both R and Python in my daily work.

As powerful as R is in terms of functionality for data analysis and modeling, motivation fades quickly during "number crunching" once RAM usage hits its limit.

In this context, a nice server installation with a lot of metal (e.g. 96 GB of RAM) works wonders.

As this option is not always available, I have made a virtue of necessity and turned to the more performant Python-based alternatives to R, especially since I have been using Python for ETL and data preparation for a long time.

Read more
Time Series Analysis Made Easy – Completely Without Analysis Tool

Starting Situation

The controlling division of a telecommunications business is to be supported in forecasting the monthly development of gross adds. "Gross adds" is the key figure that reports gross new customer growth within a defined period; the number of lost customers is not taken into account. It is used primarily in the telecommunications industry and reflects the number of newly concluded contracts (postpaid and prepaid).

Read more
Howto: Connecting Cells by Means Of arcplan 8.6

arcplan facilitates the creation of standardized reports that support the daily work of a business's employees and thus make it more efficient. This is particularly the case when reports present their data in a meaningful, concise and user-friendly manner. Large volumes of information are therefore often structured and presented as tables. To illustrate relations and hierarchies within the table and to avoid redundancies, column and row headings must be chosen and placed carefully.

Read more
High Performance (Mental) Exercise With R

This article briefly addresses the following three questions at a high level:

  • What does a data-driven person think when he hears contentions?
  • Which tool is more practical for data analyses: R, Python, Java, MATLAB?
  • Can sporting disciplines be the next application area for data analyses and machine learning?
Read more
Howto: Splitting Files With Standard Python Scripts

Ready-Made Data Sets Which Explode the Limits

I am frequently confronted with raw data provided to me for analysis which, when uncompressed, can easily amount to files of half a gigabyte or more. From about one gigabyte upwards, desktop statistics tools start to struggle. There are, of course, tool options for selecting only some of the columns, or for loading only the first 10,000 lines, etc.

But what should you do when you only want to take a random sample from the data provided? You should never rely on the file being randomly sorted; the database export may already have introduced systematic ordering effects. It may also be the case that you only want to analyse a tenth of a grouping, such as the purchases made by every tenth customer. For this, the complete file has to be read, as otherwise it is impossible to ensure that all purchases of the selected customers are taken into account.
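
Both sampling ideas can be sketched with the Python standard library alone. The following is a minimal illustration and not necessarily the approach taken in the full article: the function names, the column name `customer_id` and the sample data are assumptions made for this example. The first function keeps a fixed-size random sample of rows while reading the input only once (reservoir sampling); the second keeps all rows of roughly every tenth customer by hashing the customer ID, so the result does not depend on the order of the file.

```python
import csv
import io
import random
import zlib


def reservoir_sample(rows, k, seed=0):
    """Return k rows drawn uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)          # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive on both ends
            if j < k:
                sample[j] = row         # replace with decreasing probability
    return sample


def sample_by_customer(lines, key_column, one_in=10):
    """Yield all rows of roughly one in `one_in` customers, reading line by line."""
    for row in csv.DictReader(lines):
        # A stable hash of the ID decides membership, so every purchase of a
        # selected customer is kept, wherever it appears in the file.
        if zlib.crc32(row[key_column].encode("utf-8")) % one_in == 0:
            yield row


if __name__ == "__main__":
    # Hypothetical data: 100 customers with two purchases each.
    text = "customer_id,item\n" + "".join(
        f"c{i},item{j}\n" for i in range(100) for j in range(2)
    )
    print(reservoir_sample(io.StringIO(text), k=3))
    for row in sample_by_customer(io.StringIO(text), "customer_id"):
        print(row["customer_id"], row["item"])
```

For a real file, an open file handle can be passed in place of `io.StringIO`; neither function loads the complete file into memory.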

Read more
Uplift Modelling as Addition to Classic Response Modelling

Uplift modelling can support campaign managers in managing and planning campaigns as it supplements the classic response model of campaign scoring.

Uplift modelling is based on the basic idea that campaign responders fall into two categories: those who would have responded even without the campaign and those who would not have responded without it. Unlike classic scoring, which targets both groups equally, uplift scoring tries to isolate the second group exclusively and, wherever possible, ignore the first. For this purpose, the response information from the control group is used, which remains unused in classic campaign scoring.
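
The difference between the two views can be illustrated with a small calculation. This is only a sketch under simplifying assumptions: uplift is taken here as the response rate of the contacted (treated) group minus the response rate of the control group, computed per segment, and the segment names and figures are invented for illustration. A real uplift model would estimate this difference per customer.

```python
from dataclasses import dataclass


@dataclass
class GroupResult:
    size: int        # customers in the group
    responders: int  # customers who responded (e.g. bought)

    @property
    def rate(self) -> float:
        return self.responders / self.size


def uplift(treated: GroupResult, control: GroupResult) -> float:
    """Response rate with the campaign minus response rate without it."""
    return treated.rate - control.rate


# Hypothetical segments: (treated group, control group)
segments = {
    "loyal customers": (GroupResult(1000, 300), GroupResult(1000, 290)),
    "occasional buyers": (GroupResult(1000, 150), GroupResult(1000, 50)),
}

# Classic response view: rank by response rate of the treated group.
by_response = sorted(segments, key=lambda s: segments[s][0].rate, reverse=True)

# Uplift view: rank by the additional responses caused by the campaign.
by_uplift = sorted(segments, key=lambda s: uplift(*segments[s]), reverse=True)

print("by response:", by_response)
print("by uplift:  ", by_uplift)
```

In this invented example the loyal customers respond most often, but almost as often without the campaign, so the uplift view ranks the occasional buyers first.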

Read more
Instructions: HICHERT (IBCS) Out Of The Box

arcplan is the first business intelligence (BI) software tool to receive the renowned HICHERT®IBCS quality seal from BARC and HICHERT+FAISST. The high flexibility of the tool 'arcplan Enterprise' made it possible to meet all requirements regarding graphics, tables, structures and comments in full. Since then, arcplan has invested further development effort to simplify the creation of "IBCS compliant" reports for the user (report developer), saving a lot of time and resources during report development.

arcplan 8.5 offers a portfolio of ready-made, 100% IBCS-compliant graphics that can be integrated into an application with a few clicks. Of course, 'Quick Steps' also offers the full arcplan flexibility and can be modified, extended and tailored to specific requirements.

Read more
Howto: Easy Web Scraping With Python

Overwhelming Offer in the Webshop

Two weeks ago, a frequently used online mail-order company, whose name is reminiscent of a river in South America, drew my attention to a campaign with a friendly information email: three music CDs from a large selection were offered to me for €15.

I still enjoy buying music on physical media, so I decided to take a closer look at the offer. It turned out that approx. 9,000 CDs were offered on about 400 pages in the online shop. The shop lets you sort the offers by popularity or by customer rating. However, if I sort by popularity in descending order, I find many titles which do not quite correspond to my age group. If, on the other hand, I sort by customer rating, it turns out that the shop processes the ratings in an unweighted manner: a CD with a single 5-star rating is listed above another CD with 4.9 stars from 1,000 ratings.
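
One common way to weight ratings by their number is a Bayesian average, sketched below. This is an illustration and not necessarily the method used in the full article: the function name and the values `prior_mean=4.0` and `prior_weight=10` are assumptions chosen for the example. The idea is to treat every CD as if it started with a few "virtual" ratings at an average value, so a single 5-star rating moves the score only slightly.

```python
def weighted_rating(mean_rating, n_ratings, prior_mean=4.0, prior_weight=10):
    """Bayesian average: pull the mean towards prior_mean when ratings are few."""
    return (prior_weight * prior_mean + n_ratings * mean_rating) / (
        prior_weight + n_ratings
    )


# Hypothetical scraped data: (title, mean star rating, number of ratings)
cds = [
    ("CD with one rating", 5.0, 1),
    ("CD with many ratings", 4.9, 1000),
]

ranked = sorted(cds, key=lambda cd: weighted_rating(cd[1], cd[2]), reverse=True)
for title, mean, n in ranked:
    print(f"{title}: {weighted_rating(mean, n):.2f} (raw {mean} from {n} ratings)")
```

With these assumed parameters, the CD with 1,000 ratings is ranked above the CD with a single 5-star rating. A larger `prior_weight` demands more ratings before a CD's own mean dominates.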

Read more