Data Mining News

Leo Breiman quote about statisticians

Datamining and predictive analytics - Fri, 09/03/2010 - 05:37
One nice thing about having to move offices is that it forces you to go through old papers and folders. I found my folder containing KDD 97 conference notes, including a quote from Leo Breiman (1995) that appeared in the tutorial by David Hand:

"One problem in the field of statistics has been that everyone wants to be a theorist. Part of this is envy - the real sciences are based on mathematical theory. In the universities for this century, the glamor and prestige has been in mathematical models and theorems, no matter how irrelevant."

I love this quote because it highlights the divide between the practical and the elegant or sophisticated. Data mining and predictive analytics are "low-brow" sciences: empirical and practical. That doesn't mean that the mathematics isn't important; it very much is. But while we wait for the elegance of a theory to trickle down to us, we still need solutions.

In courses I teach, one of my objectives is to take the mathematics of the algorithms and translate the practical meaning of what they do into understandable pieces so that practitioners can manipulate learning rates and hidden units, gini and two-ing, radial kernels and polynomial kernels. Understanding backprop isn't important to most practitioners, but understanding how one can improve the performance of backprop is very much a key topic for practitioners.

We need more Breimans to pave the way toward practical innovations in predictive modeling.

Data Journalist David McCandless

Nice talk by David McCandless on culture and data visualization.

An additional interesting take on military spending would be the relationship between the country in question and the statistics of the other countries in the world as military intentions are, in part, external (in part of his presentation, David talks about the importance of providing relative views of data rather than absolutes).

The Recorded Future is Here

Recorded Future is a new venture which mines the web for statements that are associated with some time expressions. It then uses this corpus to describe the future in various geographies for various topics. In addition to the application of information extraction methods, they also present this information in creative visual displays.


 

The site is plenty full of jQuery goodness, but I did find the newbie experience a little puzzling (how do I navigate to the data visualization? not clear...)

Finally, I loved this quote from a satisfied customer:

"This definitely reduces time in figuring out what may or may not be happening in the future based on what has been happening in the past. It cuts that time in half. "

Advertising Executive

[HT Sundar]


 

Predictive Models are Only as Good as Their Acceptance by Decision-Makers

Datamining and predictive analytics - Wed, 08/25/2010 - 06:39
I have been reminded in the past couple of weeks, while working with customers, that in many applications of data mining and predictive analytics, predictive models are utterly useless unless their stakeholders understand what the models are doing. When rules from a decision tree, no matter how statistically significant, don't resonate with domain experts, they won't be believed. Arguments that "the model wouldn't have picked this rule if it wasn't really there in the data" make no difference when the rule doesn't make sense.

There is always a tradeoff in these cases between the "best" model (i.e., most accurate by some measure) and the "best understood" model (i.e., the one that gets the "ahhhs" from the domain experts). We can coerce models toward the transparent rather than the statistically significant by removing fields that perform well but don't contribute to the story the models tell about the data.

I know what some of you are thinking: if the rule or pattern found by the model is that good, we must try to find the reason for its inclusion, make the case for it, find a surrogate meaning, or just demand it be included because it is so good! I trust the algorithms and our ability to assess if the algorithms are finding something "real" compared with those "happenstance" occurrences. But not all stakeholders share our trust, and it is our job to translate the message for them so that their confidence in the models approaches our own.

The Spectrum of Time Series Forms

I've been playing around with time series data recently. Seeing the many forms that these take, I was planning a post describing a bestiary of time series. This, it turns out, is too much work, so here I have collected some examples that I hope, as a collection, demonstrate at least a small part of the vast spectrum of forms that time series data can take. (See an earlier post on using HTML5 to create this type of output).


What are the Odds?

I'm walking down a street in Bangkok and I bump into a colleague from my place of work in the UK. I'm hiking in the hills near Kathmandu and meet someone from my birth town. A friend takes a flight and ends up sitting next to the mother of someone in his daughter's class at school. What are the odds?

When we think about calculating the probability of this happening, by factoring in things like 'what is the probability that the mother was taking this flight', 'what is the probability that she would have the seat next to me', etc., we often think that we are dealing with extremely small probabilities. But like the famous birthday pairs question, this overlooks an important part of the coincidence.

The odds to be calculated are not the odds of the specific incident, but the odds of something remarkable happening. In other words, I would be equally likely to report as 'amazing' sitting next to the mother of my child's classmate, sitting next to someone from my home town, sitting next to someone who bought a lamp at my garage sale, etc. The space of things that we would consider remarkable is actually far larger than we think, especially when we are confronted with a specific case.
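The argument can be sketched numerically. The snippet below is only an illustration, not a model of any real coincidence: the per-event probability and the number of opportunities are made-up values, and the events are assumed to be independent. It shows the classic birthday calculation alongside the chance that at least one of many individually unlikely "remarkable" events happens.

```python
def prob_shared_birthday(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for k in range(people):
        # The k-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def prob_any_remarkable(p_single: float, n_opportunities: int) -> float:
    """Probability that at least one of n independent events occurs,
    when each has probability p_single (independence is an assumption)."""
    return 1.0 - (1.0 - p_single) ** n_opportunities


# Any specific pair sharing a birthday is unlikely (1 in 365),
# but some pair among 23 people sharing one is better than even odds.
print(f"Shared birthday among 23 people: {prob_shared_birthday(23):.3f}")

# Hypothetical numbers: each specific coincidence is a 1-in-10,000 shot,
# but there are 10,000 different encounters we would call 'amazing'.
print(f"At least one remarkable event: {prob_any_remarkable(1e-4, 10_000):.3f}")
```

The specific incident stays rare in both cases; what grows is the number of ways something reportable can happen.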

Visualization of Multichannel Forensics: Wired Magazine, "The Web Is Dead"

The mine that data - Fri, 08/20/2010 - 03:15
Take a peek at the article from Wired Magazine (The Web is Dead). Look at the chart at the top of the article.

This is what we are seeing with classic catalog and e-mail marketing in the majority of my Multichannel Forensics projects ... the new channels come, but they don't fully replace the old channels, leaving business leaders in a bit of a pickle.

Fortunately, you can make a boatload of profit by cutting back on old-school advertising to customers who have made the transition to newer channels!

And by the way, read each viewpoint in the article ... please, read each viewpoint. An evolution that is going to swamp e-commerce is well articulated on each side of the spectrum in the article.

Building Correlations in Clementine / Modeler

Datamining and predictive analytics - Fri, 08/20/2010 - 00:58
I just responded to this question on LinkedIn, Clementine group, and thought it might be of interest to a broader audience.

Q: Hi,
Does anyone have any suggestion or any knowledge on how to make cross-correlation in the Modeler/Clementine?

A:

I'm not so familiar with Modeler 14, but in prior versions, there was no good correlation matrix option (the Statistics node does correlations, but it is not easy to build an entire matrix).

The way I do it is with the Regression node. In the expert tab, click on the Expert radio button, then the Output... button, and make sure the "Descriptions" box is checked and run the regression with all the inputs (Direction->In) you want in the correlation matrix. Don't worry about having an output that is useful--if you don't have one, create a random number (Range) and use that as the output. After you Execute this, look in the Advanced tab of the gem and you will find a correlation matrix there. I usually then export it and re-import it into Excel (as an html file) where it is much easier to read and do things like color code big correlations.
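For readers who export the data and work outside Clementine / Modeler, the same kind of matrix can be built in a few lines. This is a minimal sketch in plain Python, not the procedure described above: the field names and values are hypothetical, and it computes ordinary Pearson correlations for every pair of numeric fields.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[row_field][col_field] -> correlation."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Hypothetical input fields; replace with the exported model inputs.
data = {
    "recency": [1, 2, 3, 4, 5],
    "frequency": [2, 4, 6, 8, 10],
    "spend": [5, 3, 4, 1, 2],
}

matrix = correlation_matrix(data)

# Print a simple text table, one row per field.
print(" " * 10 + "".join(f"{name:>10}" for name in data))
for row in data:
    print(f"{row:<10}" + "".join(f"{matrix[row][col]:>10.2f}" for col in data))
```

From here the matrix can be written to a file and color coded in a spreadsheet, as with the exported Clementine output.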

New Nordstrom Website: The Evolution of E-Commerce

The mine that data - Thu, 08/19/2010 - 03:20
Click here for a preview of the new Nordstrom e-commerce website. Take a look at the difference in the way that merchandise / "the brand" is presented to the customer.

In spite of what the trade journals and conference agendas communicate, e-commerce is under siege.

History has a way of providing us with a forecast for the future. In the 1970s, Catalog Marketers leveraged "big books" ... some of you remember these, Spiegel, Montgomery Ward, Sears, Penney, 600 page monsters that offered the customer "everything". These brands exploited the "long-tail" thirty years before the term became trendy.

In the 1980s, we had "specialty catalogs" ... smaller catalogs from Lands' End / L.L. Bean or tens of thousands of catalogers that were possible because of the magic of database marketing ... science made it possible to send a targeted merchandise assortment to a targeted audience ... clearly, this was a far more profitable proposition than sending every single item to every single customer.

In the 1990s, e-commerce burst onto the scene. In the embryonic stages of e-commerce, you needed offline advertising to drive traffic online. In other words, you needed small vehicles (catalogs, e-mail) to drive traffic to large vehicles (e-commerce website).

In the 2000s, we learned all about the "long-tail". E-commerce went the way of the 1970s catalog: once again, you had to share everything with the customer. In the last decade, technology fused search (on-site search and Google/Bing) with a "long-tail" based website, so clearly the end result is different than in the 1970s, but the concept holds ... it was again fashionable to aggregate everything under the sun, having 20% of items driving 80% of sales while finding ways to make the remaining 80% of items profitable. Good luck to the inventory manager responsible for managing long-tail inventory!

In the 2010s, the pendulum is swinging back to the 1980s ... this time, Mobile is the vehicle that is driving the change. In the 80s, the computer decided who received a smaller, targeted assortment. In the 10s, the customer and the computer will use Mobile to "go small" once again. Mobile demands that the merchant edit the assortment ... in fact, Mobile is pointless unless the merchant uses Mobile to significantly edit the assortment for the customer. Combine Mobile with localization (Foursquare / Facebook Places), and we're going really small now, aren't we?

E-commerce is the "big book" catalog of the 1970s, and it will be forced to evolve in order to compete with Mobile. You are going to hear the pundits talk about a "multi-channel" solution ... they will tell you that Mobile and E-Commerce are Peanut Butter and Jelly ... just like Catalog Marketers who said that Specialty Catalogs and E-Commerce were like Salsa and Chips back in the 2000s.

Mobile and E-Commerce are not Peanut Butter and Jelly. Mobile is going to cause a fundamental transformation within E-Commerce, one that many E-Commerce experts are not ready to deal with.

I predict (and I clearly have a good chance of being wrong) that E-Commerce will become far more entertainment-based, and far more social ... it has no choice, it has to evolve given the simplicity and personalization offered by Mobile. I sincerely believe that E-Commerce will look more and more like a highly polished cable television program over time ... I believe that E-Commerce will get a layer of frosting that goes on top of a crowded, link-based, sku-intensive website that is explored via search. Without this, the customer will choose the simplicity of the Mobile presentation. The history of Catalog Marketing points us in this direction, doesn't it?

Take a peek at the evolution of the Nordstrom website, and tell me if you think they are headed in that direction, or share your thoughts in the comments section if you think I'm nuts ... and if you think I'm nuts, send links to facts that support your personal hypothesis about what you think will happen in the future!

Gliebers Dresses: They Keep Coming

The mine that data - Thu, 08/19/2010 - 03:15

From: Meredith Thompson [mailto:meredith.thompson@gliebersdresses.com]

Sent: Wednesday, August 18, 2010 10:55 AM

To: Kevin Hillstrom

Subject: How Are You?


Dear Kevin:

Remember me? It's Meredith. Meredith Thompson! How the heck are you doing? I heard that you are working with some footwear company on some random island on the West Coast, what is that like?

I don't know if you heard the news or not. Brandon Templeton was fired as CEO of Gliebers Dresses a few weeks ago. He had this "all-in" strategy, I guess he liked playing Texas Hold 'Em or something, and decided that we had to go "all-in" and completely abandon our catalog marketing strategy. We went "all-in". And now I am "all-in" when it comes to a liquidation strategy because we missed our sales plan by 50% in July.

Was the kid a knucklehead? Absolutely. Was the kid trying to push us into the future? Probably. We fought this kid every step of the way, doing everything possible to make sure that the kid didn't kill our beloved catalog.

You have to understand, Kevin, that we love catalog marketing. I can't think about merchandising a landing page, that's boring, who wants to do that? Now, when it comes to putting together sixteen pages that tell the story about why a sundress is vital to the summertime lifestyle of a 58 year old woman, well, that is something I know about, I have a passion for this style of marketing and merchandising.

This new generation of "digi-dudes" as I call them, well, I'm not certain what they have a passion for. They batter old-school marketers like me, but they seem completely inept when it comes to creating demand. Sure, anybody can do an A/B test on a landing page to determine which style of creative generates orders more efficiently, but can these digi-dudes create the demand that sends a customer to a landing page? I doubt it. I sincerely doubt it. A generation of marketers are losing the battle on demand generation. Our kid-wonder former CEO learned this lesson the hard way. You can have apps for the Android platform, for the iPad, for the iPod, for the iPhone, for Blackberry, for Raspberry for crying out loud. How the heck do you tell customers to go and download the app? I mean, honestly, should we expect a couple hundred thousand twenty-nine year olds to virally share the fact that we have a series of apps with their closest friends?

Brandon didn't have an answer for that.

I'll use the catalog to create demand. Until somebody comes up with a better idea, I'll use the catalog to create demand. There is no such thing as multi-channel marketing without the demand generation provided by catalog marketing. E-mail sure doesn't count, though to be fair, it is better at generating demand than an app for an Android phone. Banner ads? Please. Re-targeting? Who wants to be stalked online? The catalog, now that thing creates demand!

I'll wrap this up, now. I sure wish somebody would bring you in here to spend some time with us. Roger is making a pitch to be interim CEO. The last thing we need is a daily essay on insights from Woodside Research. We need somebody who is practical enough to know that the past is still relevant, while pushing us to test new strategies. Almost everybody we know is good at one or the other, we need somebody good at both old and new. Who might you recommend to be our CEO? Is there anybody out there who might be willing to lead us? Let me know your thoughts.


Best,

Meredith

Summer Segmentation: Primary and Secondary Channels

The mine that data - Tue, 08/17/2010 - 03:15
Sometimes you have too many marketing channels. A customer could receive a postcard in the mail, then visit your website via paid search, then purchase via an affiliate.

You could credit this order to the affiliate (last touch). You could give credit to the postcard for creating demand (matchback). Or you could allocate the order 1/3 - 1/3 - 1/3 to each channel (attribution). Hint: There isn't a right way to do this, there are only wrong ways to do this. You cannot get in the head of the customer and ask her brain to properly allocate the order, can you?
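The three allocation schemes can be written down as small functions. This is a sketch under stated assumptions: the channel names and the $90 order value are made up for illustration, and "matchback" is simplified here to "give the whole order to the mailed piece if one was received".

```python
def last_touch(touches, order_value):
    """Credit the whole order to the final channel touched."""
    credit = {channel: 0.0 for channel in touches}
    credit[touches[-1]] = order_value
    return credit


def matchback(touches, order_value, mail_channel="postcard"):
    """Credit the whole order to the mailed piece, if the customer got one."""
    credit = {channel: 0.0 for channel in touches}
    if mail_channel in credit:
        credit[mail_channel] = order_value
    return credit


def equal_split(touches, order_value):
    """Split the order evenly across every touch (1/3 - 1/3 - 1/3 here)."""
    share = order_value / len(touches)
    credit = {}
    for channel in touches:
        credit[channel] = credit.get(channel, 0.0) + share
    return credit


# Hypothetical path and order value for the example customer.
path = ["postcard", "paid_search", "affiliate"]
for rule in (last_touch, matchback, equal_split):
    print(rule.__name__, rule(path, 90.0))
```

Each rule hands out the same $90; they only disagree about who gets it, which is why none of them can be called the right one.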

Ok, you've done your best to allocate the order. Now you have to record the order in your database. Is this an offline customer (buys via postcard), or is this an online customer (buys via an affiliate), or is this a search customer?

In your database, you can set up "primary" and "secondary" channels. The "primary" channel might be the affiliate, since that is the actual marketing channel where the customer placed an order. You can also give full credit to the postcard as a "secondary" channel, and to search as a "secondary" channel.

Once a customer purchases for a second time, you begin to get a clear view of the preferred "primary" channel, and you begin to get a clear view of the preferred "secondary" channel.

Among customers who have purchased 2+ times, segment customers based on their favorite "primary" and "secondary" channels. This will give you better insight into what the customer is likely to do in the future, allowing you to better allocate marketing dollars as appropriate.
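One way to sketch this segmentation, assuming each order row already carries a "primary" channel and a list of "secondary" channels (the customer IDs, channel names, and record layout below are hypothetical):

```python
from collections import Counter, defaultdict

# Hypothetical order history: (customer_id, primary_channel, secondary_channels)
orders = [
    ("A", "affiliate", ["postcard", "paid_search"]),
    ("A", "affiliate", ["postcard"]),
    ("B", "website", ["email"]),
    ("C", "phone", ["catalog"]),
    ("C", "website", ["catalog"]),
    ("C", "website", ["email", "catalog"]),
]


def favorite(counter):
    """Most frequent channel, or None if the customer has none recorded."""
    return counter.most_common(1)[0][0] if counter else None


def segment_customers(orders, min_orders=2):
    """Map each customer with at least `min_orders` purchases to a
    (favorite primary channel, favorite secondary channel) segment."""
    primary = defaultdict(Counter)
    secondary = defaultdict(Counter)
    order_count = Counter()
    for customer, primary_channel, secondary_channels in orders:
        order_count[customer] += 1
        primary[customer][primary_channel] += 1
        secondary[customer].update(secondary_channels)
    return {
        customer: (favorite(primary[customer]), favorite(secondary[customer]))
        for customer, n in order_count.items()
        if n >= min_orders
    }


segments = segment_customers(orders)
for customer, (prim, sec) in sorted(segments.items()):
    print(f"Customer {customer}: primary={prim}, secondary={sec}")
```

Customer B is left out because a single order does not yet reveal a preference; the resulting segment pairs can then be used to compare future response and allocate marketing dollars.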

Dear Catalog CEOs: Modern Catalog Contact Strategy

The mine that data - Mon, 08/16/2010 - 03:15
Dear Catalog CEOs:

Many catalogers are using the same contact strategy that was employed sometime in the late 1980s. We structure our businesses around the contact strategy, staffing to the level of contacts that we pre-authored twenty or more years ago.

And we like "big" contacts, don't we? The printing vendor community provides efficiencies for having bigger page counts. The USPS provides efficiencies for having specific, larger page counts. Our merchants demand larger page counts so that the entire merchandise assortment can be presented to the customer.

All of these activities work against the behavior of the modern customer.

The modern customer (under the age of 55) is going to buy online, regardless of whether you mail catalogs or not. We always over-state results by adding orders to catalogs that would happen online anyway, with or without a catalog mailing.

Beyond that, however, we fail to properly analyze page counts.

Here's an example that I run into every day. A company has a 148 page catalog that generates $2.50 when mailed to the average housefile customer.
  • Pages = 148.
  • Cost = $0.74.
  • Demand = $2.50.
  • Profit = $2.50*0.35 - $0.74 = $0.14.
In this situation, we'd mail the catalog, heck, it was profitable, right?

Let's try something different. Let's create a 64 page catalog, keeping only the best sellers from the 148 page catalog. The cost of mailing the catalog, on a per-page basis, is 15% more expensive. We'll use the square root rule to estimate demand at (64/148)^0.5 = 66% of the 148 page catalog:
  • Pages = 64.
  • Cost = $0.37.
  • Demand = $2.50 * 0.66 = $1.65.
  • Profit = $1.65*0.35 - $0.37 = $0.21.
What is so interesting about this is that few folks will actively create the smaller catalog, even though it is more profitable. And among folks who do create the smaller catalog, people will circulate the larger catalog to break-even, and then will transition to the smaller catalog.

Why toss so much profit into the dumpster?

Even better, why not create a targeted version of the smaller catalog? Put in the best product within one merchandise division, and send it only to customers who previously purchased from the merchandise division. When you do this, productivity often increases by 20%.
  • Pages = 64.
  • Cost = $0.37.
  • Demand = $2.50 * 0.66 * 1.20 = $1.98.
  • Profit = $1.98*0.35 - $0.37 = $0.32.
This ends up being the most profitable version!
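The three scenarios can be reproduced with a short script. This is a sketch of the arithmetic only: the 0.35 factor is read off the profit lines above (the share of demand that flows through to profit before mailing cost), the function names are made up, and the script does not round the square-root factor to 0.66, so the intermediate values differ slightly from the bullets.

```python
from math import sqrt

PROFIT_FACTOR = 0.35  # taken from the worked example: profit = demand * 0.35 - cost


def square_root_demand(base_demand, base_pages, new_pages):
    """Square root rule: demand scales with sqrt(new_pages / base_pages)."""
    return base_demand * sqrt(new_pages / base_pages)


def profit_per_catalog(demand, cost, profit_factor=PROFIT_FACTOR):
    """Profit per catalog mailed."""
    return demand * profit_factor - cost


# 148 page catalog: $0.74 to mail, $2.50 demand.
big_profit = profit_per_catalog(2.50, 0.74)

# 64 page catalog: per-page cost is 15% higher than the 148 page version.
small_cost = (0.74 / 148) * 1.15 * 64
small_demand = square_root_demand(2.50, 148, 64)
small_profit = profit_per_catalog(small_demand, small_cost)

# Targeted 64 page catalog: 20% productivity lift on demand.
targeted_profit = profit_per_catalog(small_demand * 1.20, small_cost)

for label, value in [
    ("148 pages", big_profit),
    ("64 pages", small_profit),
    ("64 pages, targeted", targeted_profit),
]:
    print(f"{label:<20} profit per catalog: ${value:.3f}")
```

The unrounded results land close to the $0.14, $0.21, and $0.32 figures in the text, and the ordering is the same: the targeted small catalog is the most profitable per piece mailed.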

A modern catalog contact strategy will include many small catalogs, with targeted merchandise sent to a targeted audience. The days of a larger catalog mailed to the entire audience are over; it is an unproductive way to generate mediocre levels of profit.

Contact me now to obtain an evaluation of your contact strategy!

IBM and Unica, Affinium Model and Clementine

Datamining and predictive analytics - Fri, 08/13/2010 - 15:25
After seeing that IBM has purchased Unica I have to wonder how this will affect Affinium Model and Clementine (I revert to the names that were used for so long here, now PredictExpress and Modeler, respectively). They are so very different in interfaces, features and deployment options that it is hard to see how they will be "joined": the big-button wizard interface vs. the block-diagram flow interface.

One thing I always liked about Affinium Model was the ability to automate the building of thousands of models. Clementine now has that same capability, so that advantage is lost. To me, that leaves the biggest advantage of Affinium Model being its language and wizards. Because it uses the language of customer analytics rather than the more technical language of data mining / predictive analytics, it was easier to teach to new analysts. Because it makes generally good decisions on data prep and preprocessing, the analyst didn't need to know a lot about sampling and data transformations to get a model out (we won't dive into how good here, or how much better experts could do the data transformations and sampling).

My fear is that Affinium Model will just be dropped, going the way of Darwin, PRW (the predecessor to Affinium Model), and other data mining tools that were good ideas. Time will tell.

Gliebers Dresses: Didn't Expect This E-Mail

The mine that data - Thu, 08/12/2010 - 03:15

From: Roger Morgan [mailto:roger.morgan@gliebersdresses.com]

Sent: Wednesday, August 11, 2010 6:03 AM

To: Kevin Hillstrom

Subject: Robotics


Kevin:

We haven't chatted in a long time, so I thought I would reach out to you. I hope you are doing well.

I understand that you have been working with a company on the west coast that is making good use of robotics in the distribution center. Would you be willing to share with me what this company is doing, how they are doing it, what it costs, and what they perceive the competitive advantages are of their robotics system? Our budgets are tight, so we're not going to pay you anything for this, we just thought maybe you'd be willing to spend a half-day or so jotting down your thoughts, you know, something that doesn't take too much time. Thanks in advance for your help, we appreciate it.

You might have heard that Brandon Templeton is out as CEO of Gliebers Dresses. This certainly isn't a surprise, I mean, this guy violated just about every best practice in the book in his quest to, as he would say, "modernize marketing". Woodside Research recently published a report that suggests that, by 2013, customers will hold up to six mobile devices at one time while leveraging offline marketing and e-mail marketing to make purchase decisions, requiring marketers to be nimble, sophisticated, and savvy at using multiple channels in a synergistic manner. Clearly, Mr. Templeton didn't read the research report, or he wouldn't have decided to obliterate our catalog marketing program in a short-sighted attempt to demonstrate the viability of emerging channels.

I am hoping that you might be able to assist me. Fitz Gleason, the gentleman who owns Gleason Investments, our parent company, is actively searching for our next Chief Executive Officer. As you already know, it sometimes takes a long time to find a viable Chief Executive Officer ... Woodside Research states that the average time it takes for a company to find a CEO is about eight months. I would like the opportunity to showcase my talents to Management. I would like to become the Interim CEO, and with luck, I could demonstrate that I deserve to be the permanent CEO of Gliebers Dresses.

You are already familiar with my strong strategic mindset. Nobody is going to come to the table better prepared than I am. I challenge any operations leader to match my knowledge of multi-channel marketing. You know as well as I do that it is critical to subscribe to and purchase the papers issued by all of the leading research organizations. Though I haven't been given the opportunity to run a marketing program, I possess a thorough knowledge of all multi-channel marketing best practices. Heck, I know why it is important to optimize the number of pixels in an e-mail pre-header. I've read Seth Godin, so I know how the lizard brain fights against making strategic changes to a marketing organization.

Outsiders might suggest that I don't have the merchandising experience necessary to grow a business like Gliebers Dresses, but I disagree with that point of view. Neptune Research suggests that merchandisers that will win in the future source product in a nimble and rapid manner, responding to customer demand in real time. I believe I can move our organization in this direction in a frictionless manner.

Anyway, I am hoping that you would be willing to call Fitz Gleason and put in a good word on my behalf. I know that you are respected in the industry, so I am confident that your thoughts would go a long way toward helping advance my career objectives.

Let me know when you have had a chance to speak with Mr. Gleason. I am heading to Lake Winnipesaukee for the weekend, I will be available next week if you have any questions.

Thanks,
Roger Morgan

The iPad & Mobile

The mine that data - Wed, 08/11/2010 - 03:15
Last week, I purchased an iPad.

This week, I view the world in a different way. I read books via the Kindle App, a better experience than reading a physical book ... heck, I can see areas in the book where other readers highlighted important facts. The Weather Channel App is fantastic. I can search any radio station in America that is playing a certain song using the TuneIn App, hop on, and listen to that song, or I can play my entire music library on the device. I can use the device as a mobile GPS platform with the 3G connection. I can listen to my hometown radio station while traveling. I can watch a streaming movie with the Netflix App.

Of course, it's the user interface that makes the iPad and the coming onslaught of competitors different. As if a page was taken out of the movie Minority Report, your finger becomes the mouse. The "app" fuses a computer program with easy website navigation.

Many of you are reading this and saying "duh" ... you've owned an iPhone for years, you know all about this.

Many of you are reading this and saying "boring ... the iPad is a clumsy laptop, I can do all of that online right now, the iPad is an expensive and functionless toy."

I will say this. If you perceive the device to be different, then the device is different. And that's all that matters. Folks who view the device as being different create apps for the device that are different, or use apps in a way that is different from the traditional web experience.

As Ben Stiller said in "Night at the Museum", "... there's a storm comin', buddy."

A whole chunk of the e-commerce / online channel is getting ready to break off, sort of like the giant iceberg that broke off of Greenland this week.

There are ramifications.

If you are an e-commerce brand, how do you decide which of your 12,000 skus deserves to be featured in an app? Or does the app even bother focusing on the best 1% of skus, instead seeking to solve a customer problem over selling the customer merchandise?

If you are a publisher that makes money from selling ads, what do you do when you lose 30% of your homepage traffic to an app that does not monetize ads as effectively?

If you are Barnes & Noble, what do you do with debt-ridden stores that house paper books when a third of your former store customer base is using the Kindle or a Kindle app on the iPad? Even if you have your own device or you have your own app, you still have to cover the costs of your debt-ridden stores ... right? How do you do that?

If you are a web analyst, do you try hard to be an expert at analyzing what happens at http://www.weather.com, do you become an expert at a new generation of software that will inevitably appear to analyze mobile transactions, or do you become an expert at analyzing how all online and offline channels fit together? It's a relevant question, one I hope you are spending time pondering.

If you are an e-mail marketer, do you optimize a channel that is in slow decay? Or do you jump into mobile and be the "conduit" between old-school marketing tactics and apps?

If you are a catalog marketer, do you focus on harvesting every last penny out of the 64 year old Upstate New York customer who loves to shop via paper in the mailbox? Do you spend the 15 free minutes you have each day fussing over whether the model on the back cover of the catalog is 'brand appropriate', or do you lead your company into the future by creating the most innovative publishing/magazine app that conveys all of the subtleties of merchandising/creative that are utterly absent from modern e-commerce?

I have no idea how all of this will turn out. I can only see, from my experiences, that I've changed ... and I've had the device for a week. What happens when 40 million households have a similar and far more affordable device?

As Ben Stiller said in "Night at the Museum", "... there's a storm comin', buddy."

Summer Segmentation: E-Mail Response

The mine that data - Tue, 08/10/2010 - 03:15
If you want to have some fun, create two variables in your database.
  1. Recency of click-through from an e-mail campaign.
  2. Recency of purchase after clicking through an e-mail campaign.
You'll find that those on your e-mail list that don't record activity in either variable in the past twelve months have very little value ... to the e-mail channel.

You'll find that those on your e-mail list that do record activity in either variable in the past twelve months have a different future trajectory than do other customers.
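A minimal sketch of how the two variables might be derived, assuming you can pull each subscriber's most recent e-mail click date and most recent e-mail-driven purchase date (the function and field names are hypothetical, and recency is measured in whole calendar months as a coarse approximation):

```python
from datetime import date


def months_since(event_date, as_of):
    """Whole calendar months between event_date and as_of (None if never)."""
    if event_date is None:
        return None
    return (as_of.year - event_date.year) * 12 + (as_of.month - event_date.month)


def email_segment(last_click, last_email_purchase, as_of, window_months=12):
    """Build both recency variables and flag activity within the window."""
    click_recency = months_since(last_click, as_of)
    purchase_recency = months_since(last_email_purchase, as_of)
    active = any(
        recency is not None and recency <= window_months
        for recency in (click_recency, purchase_recency)
    )
    return {
        "click_recency": click_recency,
        "purchase_recency": purchase_recency,
        "email_active": active,
    }


as_of = date(2010, 8, 10)

# Clicked in March 2010, never purchased after a click: still e-mail active.
print(email_segment(date(2010, 3, 1), None, as_of))

# Last click in January 2009, last e-mail purchase in late 2008: inactive.
print(email_segment(date(2009, 1, 15), date(2008, 11, 2), as_of))
```

Splitting the list on the `email_active` flag gives the two groups described above, which can then be tracked separately.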

Show of hands ... how many of you have either variable actively coded on your database?

Are Facebook likes Flooding the Internet?

I recently enabled the Facebook 'like' feature on this blog, which is hosted by SixApart's Typepad service. Of late, I haven't been blogging at anything like the rate I'd like, but - O happy day - my traffic (according to Typepad) has been increasing quite a bit. Which blogger wouldn't be happy to see this:


While I might have suddenly become more relevant, I suspect the reason that I'm getting this increase in traffic (except for that large peak, which is legitimate) is related in some way to Facebook's 'like' feature.

Consider the following from my traffic details in my Typepad dashboard:

Generally, this is to be read 'a visitor came from www.facebook.com/plugins to the page with the path /data_mining/datamining.' But the pattern is too predictable and not easily explained by a human visitor.

A possible explanation of the problem is the following: when someone visits a view of my blog which involves aggregates of posts (say, visiting the home page, which collects the most recent 10 posts), the Facebook 'like' button gets rendered. Facebook wakes up and decides to pull the page - somehow leaving behind this plugins reference. Unfortunately, this seems to be happening almost every time, rather than in a sensible, cache supported manner.

I'm pretty sure I don't have all the details right. For example, when I look at the two other services which I use to track traffic, they don't appear to register these references from Facebook. Does this indicate that it is something to do with the setup between Typepad and Facebook? Or perhaps it is some issue with how Typepad collects and displays visits. Perhaps the other two services are incorrectly removing these references? Generally, when a robot crawls your site, it doesn't leave an indication of where it came 'from' as it would just be fetching from a list of effectively arbitrary URLs. Does that indicate it is some sort of crawler faking identity as a human user?

The Facebook API documentation says:

When does Facebook scrape my page?

Facebook needs to scrape your page to know how to display it around the site.

Facebook scrapes your page every 24 hours to ensure the properties are up to date. The page is also scraped when an admin for the Open Graph page clicks the Like button and when the URL is entered into the Facebook URL Linter. Facebook observes cache headers on your URLs - it will look at "Expires" and "Cache-Control" in order of preference. However, even if you specify a longer time, Facebook will scrape your page every 24 hours.

At any rate, while I'd be happy to be getting the increased traffic, I'd rather get accurate traffic reports.

Does anyone have any insights? Anyone from SixApart or Facebook?
 

Dear Catalog CEOs: Pages and Contacts

The mine that data - Mon, 08/09/2010 - 03:15
Dear Catalog CEOs:

These days, many of you contact me, hoping that there is a secret to generating more demand with catalog marketing. Can you find better names in our housefile? Can you encourage the co-ops to create better models for us? Is there a list out there that we are missing that would cause our business to grow by 30%?

Sometimes, the answer is right under our nose.

Pretend that you mail a 124 page catalog every month.

Pretend that this catalog, when mailed to an above-average performing customer, generates $5.00 of demand per mailing.

What would happen if you mailed two contacts a month, each contact at 64 pages?

I can already hear folks howling at me. "We don't have the creative resources to mail two catalogs instead of one." "You have to show the entire merchandise assortment or the customer won't buy across the entire merchandise assortment!" "Our customers don't want increased frequency."

Math, believe it or not, can suggest that the strategy might be more productive.

I start with the 124 page catalog at $5.00 per customer. I'll use the friendly "square root function" to estimate what might happen at 64 pages ... (64/124)^0.5 * $5.00 = $3.59.

So, each 64 page catalog will generate $3.59. Now, let's assume that the second contact cannibalizes the first contact by 30%. Therefore, the two contacts will generate $3.59 + $3.59*0.70 = $6.10.

Now, we're getting somewhere! Let's assume that it costs 15% more per page to mail two 64 page catalogs than it costs to mail one 124 page catalog. Instead of costing, say, $0.60 to mail 124 pages, it will cost (64+64)*1.15*($0.60/124) = $0.71.

Time to run the profit and loss statement. Assume that 35% of demand flows through to profit:
  • 124 Pages = $5.00 * 0.35 - $0.60 = $1.15.
  • 64 Pages + 64 Pages = $6.10 * 0.35 - $0.71 = $1.43.
You tell me, which strategy would you rather employ?
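For anyone who wants to try their own page counts, here is a minimal sketch of the arithmetic above. It simply restates the assumptions in this post (square-root scaling of demand by page count, 30% cannibalization of the second contact, a 15% per-page cost premium, 35% flow-through); the function and variable names are made up for illustration, and none of the inputs are measured values.

```python
def scaled_demand(base_demand, base_pages, new_pages, exponent=0.5):
    """Estimate demand at a new page count using the square-root rule of thumb."""
    return base_demand * (new_pages / base_pages) ** exponent


def two_contact_demand(per_contact_demand, cannibalization):
    """Demand from two contacts when the second one loses a share to the first."""
    return per_contact_demand + per_contact_demand * (1.0 - cannibalization)


def mailing_cost(base_cost, base_pages, total_pages, per_page_premium):
    """Cost of mailing total_pages at the base per-page rate plus a premium."""
    return total_pages * (1.0 + per_page_premium) * (base_cost / base_pages)


def profit(demand, flow_through, cost):
    """Profit per customer: demand that flows through, minus mailing cost."""
    return demand * flow_through - cost


# Inputs taken from the example in the post.
BASE_DEMAND = 5.00      # dollars per customer for the 124 page catalog
BASE_PAGES = 124
BASE_COST = 0.60        # dollars to mail the 124 page catalog
SMALL_PAGES = 64
CANNIBALIZATION = 0.30  # second contact loses 30% of its demand
COST_PREMIUM = 0.15     # 15% more per page for the two smaller mailings
FLOW_THROUGH = 0.35     # share of demand that becomes profit

per_contact = scaled_demand(BASE_DEMAND, BASE_PAGES, SMALL_PAGES)
double_demand = two_contact_demand(per_contact, CANNIBALIZATION)
double_cost = mailing_cost(BASE_COST, BASE_PAGES, 2 * SMALL_PAGES, COST_PREMIUM)

single_profit = profit(BASE_DEMAND, FLOW_THROUGH, BASE_COST)
double_profit = profit(double_demand, FLOW_THROUGH, double_cost)

print(f"one 124 page contact: ${single_profit:.2f} profit per customer")
print(f"two 64 page contacts: ${double_profit:.2f} profit per customer")
```

Change CANNIBALIZATION or COST_PREMIUM to see how quickly the advantage of two contacts shrinks; the conclusion is only as good as those two guesses.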

Catalog marketing is evolving in very interesting directions. Your best catalog customers, the rural 65 year old woman, for instance, can support more contacts, contacts that are smaller. Your marginal customers require fewer contacts --- they are going to shop online now, so a smaller catalog can be used to save expense, increase circulation depth, and still drive customers online.

Contact me now for a contact strategy evaluation. Let's put modern page count changes and contact strategies into practice!!

Mobile's Not Zen

How cool is this? The picture below is from the Twitter Map overlay on Bing Maps. It shows images tweeted from the Rush concert at White Water Amphitheater last night.


Some of the tweets from the concert: "Solo still going", "Yay drum solo", etc.

During the concert, the audience was lit in part by the ethereal glow of mobile devices held aloft to capture the spectacle.

To me, none of these people are living in the moment. It is as if they are instituting a homunculus - a little agent which is accounting for what they are doing: 'ok, now I am watching a drum solo', 'I'm watching this performance through an "eye" that is watching this performance.'

Oh, and my neighbor's seat was row 21, seat 12.

Netflix: Classic Multichannel Forensics

The mine that data - Sat, 08/07/2010 - 21:33
You have to love Multichannel Forensics when you consider the case of Netflix v Blockbuster, don't you?

Best of all is the fact that DVD rentals are down 25% year-over-year, while streaming of movies is up from 37% of subscribers to 60% of subscribers this year ... and oh, by the way, subscribers are up 50% year-over-year.

So many of us in the catalog industry failed to capitalize on disruptive technology. We were completely misled by a vendor industry pushing a multichannel platform that protected their business, not our business. As a result, we kept mailing catalogs, hoping the catalogs would cause customers to buy online. We failed to resonate with an entire generation of folks, who are now age 40 and younger, representing a huge cohort (especially under the age of 30) that the cataloger simply can't easily reach.

And now, we have a generation of e-commerce experts who are about to make the same mistake with mobile. The analytics folks are saying that you are best off using old-school web analytics software with modifications to analyze how mobile and social yield a customer with "multichannel" characteristics. Does that story sound familiar? The story is percolating everywhere, folks.

You wonder if disruptive technologies are best utilized in an independent manner, not an integrated manner. The catalog generation tried hard to integrate old-school techniques with the web, without success. Retailers tried to integrate the web experience with stores, but didn't factor in how heavy debt loads in retail would cripple the retail experience with a small drop in same store sales. And now, the e-commerce generation gets ready to take on mobile. Will the mistakes be replicated?