Tuesday, August 31, 2004

"As Is" vs "As Was" Reporting Using Type 2 Dimensions

Mark Rittman has a real good writeup on some dimension modeling scenarios, which are always nice to keep in your toolbox.

Reminds me a little of the classic HR Schema (explained by Ralph Kimball), where the Employee Dimension has as many rows as the Fact table, with the benefit of knowing exactly how things looked at any point in time in the past.
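To make the "as is" vs "as was" distinction concrete, here is a minimal Python sketch. The table layout, the column names (key, emp_id, dept, is_current) and the sample rows are all hypothetical and are not taken from Mark's writeup; it only illustrates the general Type 2 idea of joining facts either to the version that was valid at the time or to the current version.

```python
# Hypothetical Type 2 employee dimension: one row per version of an employee.
EMPLOYEE_DIM = [
    {"key": 1, "emp_id": "E1", "dept": "Sales", "is_current": False},
    {"key": 2, "emp_id": "E1", "dept": "Marketing", "is_current": True},
]

# Hypothetical facts, each stamped with the surrogate key valid when it happened.
FACTS = [
    {"emp_key": 1, "amount": 100},
    {"emp_key": 2, "amount": 250},
]


def as_was(facts, dim):
    """Total by the department the employee was in when the fact occurred."""
    by_key = {row["key"]: row for row in dim}
    totals = {}
    for fact in facts:
        dept = by_key[fact["emp_key"]]["dept"]
        totals[dept] = totals.get(dept, 0) + fact["amount"]
    return totals


def as_is(facts, dim):
    """Total by the department the employee is in today."""
    by_key = {row["key"]: row for row in dim}
    current = {row["emp_id"]: row["dept"] for row in dim if row["is_current"]}
    totals = {}
    for fact in facts:
        emp_id = by_key[fact["emp_key"]]["emp_id"]
        dept = current[emp_id]
        totals[dept] = totals.get(dept, 0) + fact["amount"]
    return totals
```

The only difference between the two functions is which dimension row supplies the department: the one the fact points at, or the current one for the same employee.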

Sunday, August 29, 2004

GMAILFS - The Google Filesystem

Mark Rittman points to GmailFS, a way to use Gmail as file storage and have it look like ordinary storage to your file system.

Jeremy Zawodny posted this as an idea a couple of months ago, looks like someone figured it out.

Saturday, August 28, 2004

The Now Economy: 75% don't have processes in place to take advantage of real-time info...

The Now Economy: 75% don't have processes in place to take advantage of real-time info...: A summary of a recent Optimize mag article, with a ton of good links and commentary.

Friday, August 27, 2004

Comparing Business Intelligence Platforms: Microsoft SQL Server 2000 Analysis Services Vs. IBM DB2 OLAP Server 8.1 & Hyperion Essbase 6.5

Comparing Business Intelligence Platforms: Microsoft SQL Server 2000 Analysis Services Vs. IBM DB2 OLAP Server 8.1 & Hyperion Essbase 6.5 from Progressive Strategies is actually a pretty robust comparison of the platforms, covering ease of use, performance, cost of ownership and scalability in great detail.

The conclusion: Analysis Services is the clear winner. Which is all well and good, but just a little clouded by the fact that one of the co-authors is a Microsoft employee.

Thursday, August 26, 2004

AOL's Second Annual Instant Messaging Trends Survey Shows Instant Messaging Has Gone Mainstream

Just noticed this, popping up all over the place, and it's obviously the catalyst for the news two posts ago. LinuxElectrons has the best write up and summary of the survey, which says among other things:

  • Nearly all surveyed teens and young adults (90 percent) engage in instant messaging, but IM is not the teen phenomenon it was once considered. An amazing 48 percent of those aged 55+ now use instant messaging, with photo sharing their favorite feature. Seven out of ten 22 - 34 year-olds and 55 percent of adults aged 35 - 54 use IM at home, at work or on any number of mobile devices.
  • At-work use is gaining momentum with 27 percent of all IM users saying they use IM in the workplace - a 71 percent increase over last year. Fully 43 percent of employed IM users say they use desktop IM to communicate quickly in the workplace.
  • Nineteen percent of IM users now send IMs or SMS text messages from mobile phones and PDAs, as compared with ten percent that did so last year. Thirty-two percent of these mobile messengers say they stay in touch with co-workers via mobile IM or SMS text messages while on business travel.

Lots of interesting tidbits in the article; it's worth a read.

Actuate Joins Eclipse, Starts Open-Source BI Project

Interesting... Actuate Joins Eclipse, Starts Open-Source BI Project has all the details.

From my understanding, Eclipse is an Open Source IDE that can support a number of environments, but focuses almost exclusively on Java. IBM is a huge backer, and pushes the technology.

As for the project itself, from the Project Proposal:

Initially, the Project will focus on leveraging the Eclipse platform to provide infrastructure and tools for the designing, deploying, generating and viewing of reports in an organization. Over time, the creation of additional projects is anticipated and encouraged to address additional aspects of business intelligence, such as Online Analytical Processing (OLAP), statistical analysis, Extract Transform and Load (ETL) tools and so forth.

which seems to focus on Actuate's strengths. Fair enough. As for the planned features:

Reports extract data from a data source or sources, perform manipulations and calculations on the data to answer business questions, and present the results as information in a formatted and convenient form for the business user to use. This information is then typically used for operational or decision support purposes within an organization. Reports vary dramatically in size, content and complexity and will include or combine characteristics such as:


  • Listings of information (Example: Transactions in an account)
  • Sorting, grouping and aggregation of data with and without subtotals (Example: A listing of all product sales for each sales person, grouped by state)
  • Charts to present information in easy to understand formats (Example: Pie chart showing an investor's portfolio allocation by High Growth/Growth/Income/etc. categories)
  • Matrix or cross-tab layouts (Example: Financial budget reports with cost codes as rows, columns for each month, and cells containing numerical data for that cost code/month)
  • Delivery of information as one or a combination of web pages, PDF files, printed documents, Excel files, etc. (Example: Frequent flyer statement delivered as a web page online and a printed document in the mail)
  • Precise, highly formatted layouts (Examples: Bank statements; utility bills; commission statements; invoices; government forms)
  • Page navigation for long reports (Examples: Hundreds of pages corporate cell phone usage bill with First Page, Next Page, Goto Page, etc. buttons)
  • Table of Contents (Example: Multi-page Investment Portfolio summary with Table of Contents to quickly navigate to Account Summary; Fund History; etc.)
  • Keyword or content search within a report (Example: find information on a customer in a 1000 page customer account report)


All of a sudden it's pretty unexciting. A yawner in fact. Standard reporting suite (except no mention of integrating with a directory for authentication and securing reports?) with, a little surprisingly, nothing that would separate it from any other suites out there (probably by design, since reporting is Actuate's bread and butter).

Looks to me like an announcement of an intention to provide an alternative to Reporting Services which matches its price.

AOL Instant Messaging Survey Lights The Way For New Marketing Opportunities

MediaDailyNews writes:

Instant messaging (IM) is now a mainstream communication platform for Web users, according to new data collected by America Online and Opinion Research Corporation, and the new usage figures are prompting the world's largest Internet Service Provider to aggressively market the Instant Messaging client as a unique opportunity for marketers.

The article speaks directly to a for-fee service called "Expressions" that will allow AIM users to customize their client (and allow AOL to benefit from both sides of the deal), but it's clearly the first step into widespread marketing to the IM population.

I spoke to AOL last week about bots, in preparation for one I'm working on for a film coming up, and they are on board there also, although the terms are still much less defined. I'm certainly glad to be in the loop though.

Tuesday, August 24, 2004

Scanning the network for SQL Server

Scanning the network for SQL Server has a couple of simple scripts to find SQL Server installations on the network. It's very important to do this in a large company from time to time, as there are always new installations popping up.

Great idea, except that the scripts in the article won't cut it. Totally inadequate. Instead, use sqlscan.exe from the SQL Server 2000 Security Tools. It will scan every machine within an IP range and identify all MSDE installations, neither of which the scripts in the article will do.

Forbes.com on "Cheapware", or the open source option

Forbes.com - Cheapware:

"Craig Murphy has had enough. As chief technology officer at Sabre Holdings, which runs the world's largest airfare and ticketing network, Murphy has spent millions of dollars on database and other software from companies like Oracle. But last year, when Sabre was building a new computer system for online shoppers, Murphy took a flyer on a database program from a little-known company in Sweden that charges only $495 per server computer, versus a $160,000 list price for Oracle. Guess what? The Swedish stuff works great. Fired up, Murphy is hunting for other places to use the cheaper software, called MySQL.

"We're just not going to pay license fees for those databases like we used to. We'll download free stuff off the Internet before we do that," Murphy says. "I believe this is the future of computing."



Wow, that's rough. And it's happening over and over again, and I imagine after a profile in Forbes, it will just accelerate.

What are you doing to consider MySQL? Story via Sadagopan.

Mapping objects to relational databases

Here's an old but good article if you haven't seen it before. Mapping objects to relational databases is an IBM developerWorks article that describes some approaches useful in database development.

Found via the ETL Guy.

Mark Rittman: Looking To The Future With Microsoft And Oracle OLAP

Looking To The Future With Microsoft And Oracle OLAP looks at Microsoft's and Oracle's OLAP offerings planned for Oracle 10g and SQL Server 2005. Great overview that I found very useful!

Monday, August 23, 2004

Integrate MSN Messenger into Project Server

Microsoft Windows Messenger Integration Custom Project Guide is a set of tools that will add in the capability on Project Server to see people's IM status on MSN. If implemented and used correctly it can be very effective for dispersed teams, as just viewing the Project website will show at a glance which team members are at their desk (and available for a quick conversation) and which ones are away.

This is a trend in a few MS products, including Sharepoint Portal Server, and Live Communication Server gives you the means through an API to add the presence capability to other applications. It's definitely a trend that will continue and eventually become a given.

Adding such presence features to other projects actually isn't that tough. Yahoo especially makes it very easy: a simple URL is all that's required to see if anyone is online or not. Just replace my screenname with anyone else's and add dynamically into any web app:
<img src="http://opi.yahoo.com/online?u=dlbham">

It's also easily possible with AIM, except you must specify your own Online and Offline images:

<img src="http://big.oscar.aol.com/Your_Aim_Name?on_url=Location_Of_Your_Online_Image&off_url=Location_Of_Your_Offline_Image" border = 0 >

Unfortunately I have never come across an MSN version of these that can be as easily integrated into a webpage. There are several webpages that allow you to check if someone is online, but they all contact MSN on the IM protocol level to "finger" the status of an address.
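Generating the Yahoo and AIM tags above dynamically in a web app could look something like this Python sketch. The two base URLs are the ones shown in this post; the function names are made up for illustration, and the percent-encoding of the image URLs is my assumption about what the AIM endpoint accepts, not something I have confirmed.

```python
from html import escape
from urllib.parse import quote, urlencode


def yahoo_presence_img(screen_name):
    """Build the Yahoo online-status <img> tag for one screen name."""
    url = "http://opi.yahoo.com/online?" + urlencode({"u": screen_name})
    return '<img src="{}">'.format(escape(url))


def aim_presence_img(screen_name, on_url, off_url):
    """Build the AIM status <img> tag; you supply your own on/off images."""
    query = urlencode({"on_url": on_url, "off_url": off_url})
    url = "http://big.oscar.aol.com/{}?{}".format(quote(screen_name), query)
    # escape() turns the "&" between parameters into "&amp;" for valid HTML.
    return '<img src="{}" border="0">'.format(escape(url))
```

Calling yahoo_presence_img("dlbham") reproduces the Yahoo tag shown above; swap in any other screen name to check someone else.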

Unearth the New Data Mining Features of Analysis Services 2005

Unearth the New Data Mining Features of Analysis Services 2005 is a writeup on the data mining features in the upcoming beta 2 of SQL Server 2005 (formerly code-named Yukon, and still referred to by that name). Has a couple of screenshots and some explanations of the extensions to the little-used "DMX", or Data Mining Extensions for SQL, which allows you to query Data Mining Models for results.

Here's an example of what DMX statements look like, first a prediction query and then a query that browses the model's content:

SELECT NewCustomers.CustID, PredictProbability(Churned, True)
FROM CustomerChurn NATURAL PREDICTION JOIN
OPENQUERY ('My Datasource', 'SELECT * FROM NewCustomers') AS
NewCustomers

SELECT * FROM CustomerChurn.CONTENT

Looks like there are a lot of great tools coming up in SQL Server; time will tell if they deliver the goods.

Thursday, August 19, 2004

Build a Browser-based OLAP Reporting Solution - from Only4Gurus.com

Build a Browser-based OLAP Reporting Solution is a PPT plus zip of sample code from an MCSD Technet session. It is a good walk through and uses Office Web Components to show a pivot table delivered over the web.

pocOLAP

pocOLAP - the "little" OLAP-project is a J2EE application designed for reporting. Here's the demo.

It's not an OLAP server, just pretends to be one. It's open-source and free however, so perhaps worthy of a test in Java environments.

Found on Javalobby via Feedster.

Wednesday, August 18, 2004

SCO's stock

I noticed this via a comment on Slashdot... SCO Unix, known primarily at the moment for claiming ownership of code included in Linux, has over 55% of its outstanding shares on the market shorted. Wow. There aren't a whole lot of people that see any hope for the company and its lawsuits apparently.

Saturday, August 14, 2004

Fuzzy Lookup and Fuzzy Grouping in DTS for SQL Server 2005

Fuzzy Lookup and Fuzzy Grouping in Data Transformation Services for SQL Server 2005 on MSDN takes a look at the "Fuzzy Lookup Transformation" and "Fuzzy Grouping Transformation" (those are the real names of the DTS objects) in Yukon's DTS.

The purpose is to leverage general search algorithms to reduce complex code for the purposes of lookups and data cleansing. Very interesting, and shows that Microsoft is serious about making DTS a full-fledged ETL tool.
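For a feel of what a fuzzy lookup does, here is a small Python sketch. To be clear, this is not Microsoft's algorithm: it uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure, and the function name, the 0.8 threshold and the sample company names are all assumptions for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_lookup(value, reference, threshold=0.8):
    """Return (best_match, score); best_match is None if nothing is close enough."""
    best, best_score = None, 0.0
    for candidate in reference:
        # ratio() is 1.0 for identical strings and 0.0 for nothing in common.
        score = SequenceMatcher(None, value.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score < threshold:
        return None, best_score
    return best, best_score


REFERENCE = ["Microsoft Corp", "Oracle Corp", "IBM"]
match, score = fuzzy_lookup("Microsft Corp", REFERENCE)
```

A misspelled incoming value still finds its clean reference row, which is the kind of matching that otherwise takes a pile of hand-written cleansing code.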

In another article I saw a week ago that I can't locate right now, an analyst pointed out that there will be many new features in SQL Server 2005, but they will require a lot of training to take advantage of. All things told, that's pretty much the case with SQL Server 2000, but it looks like SQL Server 2005 will be especially so.

Friday, August 13, 2004

Multiple Grains in a Hierarchy

Multiple Grains in a Hierarchy explains one of the features in Yukon's Analysis Services: what it means, and how it can allow excellent detail where needed and less detail (with lower performance impact) where it isn't.

Thursday, August 12, 2004

Linux In the Mainstream

A couple of Linux stories... Enterprise Systems | BI-on-Linux Goes Mainstream talks about Cognos, Actuate, and Siebel's decision to support Linux ports of their products.

IBM's mainframe momentum continues on CNET is about IBM's win with a European company to migrate all 19 of its SAP applications onto 2 mainframes running virtual Linux servers. Although good news for IBM's mainframe division, the real story to me is how virtualization is going mainstream. The same is possible with VMWare or VirtualPC on Intel platforms, and is worthy of consideration. The same advantages apply in both environments.

Tuesday, August 10, 2004

Web Analytics Services

Web Analytics is an area of analytics probably a little bit farther ahead of the curve than most. Web Analytics Services is a survey by Network Computing magazine that covers 10 options that promise big results:

We wanted to identify site-visitor behavior and determine the reasons for that behavior. Also crucial to us was further end-user interaction and the support for non-IT business and content owners. We also considered, though did not test, integration with other business-information sources. We evaluated each product based on visitor identification, campaign information, path-navigation analysis, e-commerce report writing, administration, integration, training and performance.

Link comes by way of Nettakeaway.

MSN Web Messenger

Late to the punch, but big news nonetheless in the mainstream. MSN has released an official Web Messenger that allows you to sign in and have an MSN session from anywhere you have access to a web browser.

This is useful in a number of scenarios... for example, Atlanta Freenet is a wireless community network that serves the public at a coffee house I sometimes go to. Trouble is, it allows web and mail traffic, but blocks MSN. Many companies also try to block IM, often in vain. This won't make it easier.

AIM has had this service for some time (although it is a Java plugin). And MSN has too... although all were done by hobbyists and are harassed from time to time by MSN. For several years, sites such as E-Messenger and MSN2Go have been offering such a service for free. Initially set up as a way to access MSN through a mobile WAP connection, they quickly were jumped on by others. Based on the buzz and personal contacts I'm exposed to, the vast majority of the users are teenagers, reflecting the IM population as a whole. Number one use? Accessing IM from schools that block normal traffic.

And apparently it's a little battle in some schools. The sites get blocked of course (as they do in many companies), so new sites constantly pop up. Obviously, MSN is reacting to a need others are filling right now, but I doubt these rogue services will go away.

In fact, it's a bit silly... the young, who seem to effortlessly be able to handle being constantly connected, are on to something that has more positive than negative consequences, both in the real world and in the business world.

Microsoft SQL Server 2000 Index Defragmentation Best Practices

Microsoft SQL Server 2000 Index Defragmentation Best Practices - from Only4Gurus.com is a 23 page document that contains just about everything you want to know about index defrags. Very detailed, and covers different approaches for small and large environments.

Monday, August 09, 2004

Rensselaer Data Warehouse Project

Just stumbled across the portal for the Rensselaer Data Warehouse Project, and thought I'd link it. The documentation of the warehouse is comprehensive and impressive, with full project plans, lots of pretty star schemas of the financial analysis mart, and full metadata definitions.

For someone starting out trying to get their mind around a data warehouse project, it can be tough visualizing it from beginning to end, and guessing how theory gets applied in the real world. The huge source of information around this live project should help a great deal.

Blog News

When I initially created this blog, I intended it to be focused on a couple of subjects I know better than the "average" and I hoped that linking articles and writing my opinion on a few things might help other people interested in learning about the same things. The two subjects I chose to focus on were anything BI related (with a focus on Microsoft options), and IM bots. Data Warehousing, OLAP reporting, Data Mining, BAM etc are subjects that my company does and sells every day, so naturally that has been the bulk of my posts. IM bots were just a hobby I stumbled into but after I released an open source project that made bots easier to write, I started to put a lot more thought into it.

After some recent developments with IM, it will become part of my regular job, and I've decided to post more about IM subjects in general, but in particular IM bots and IM for use as a marketing vehicle. Both get very little coverage accessible to the general public. I visit Stowe Boyd's Get Real for mainstream and enterprise related IM news, but the subjects I want to cover are much too niche and under the radar. This won't mean I'll post less on the BI world, just more on IM.

The last few weeks IM bots have suddenly taken up a lot more of my time. A number of people have sent in bug fixes for my project, and each was using my project in novel ways. For example, there is a college in Nebraska using it to send project management deadline alerts, and in a strange twist analytics software powerhouse SPSS is experimenting with the bot as part of its MSurveys service.

Lastly, over the last couple of weeks I have been talking to a PR firm in Hollywood interested in setting up an IM bot for a couple of upcoming films. This is pretty exciting for me, as I've been preaching in a few disconnected places the untapped potential IM has for a marketing platform. I personally believe marketing over IM will be huge in the future (the numbers, population demographic, and lack of penetration SHOULD be a marketer's dream). I happen to be closer to the cutting edge in this area than most, so I'll write about my experiences a bit more.

If I were smarter and had more time, I'd figure out how to have categories with separate RSS feeds for IM and BI/OLAP related posts, but until I do, the posts will all be intermingled. My apologies to anyone that finds it a pain to sift through what they don't wish to see.

Thursday, August 05, 2004

Survival Data Mining

Intelligent Enterprise has an article this month titled Survival Data Mining for Customer Insight. If you haven't tried these techniques out, be sure to read it.

Survival data mining is...
In the medical world, doctors often want to understand which treatments help patients survive longer - and which have no effect at all (or worse). In the business world, the equivalent concern is when customers stop being customers. This is particularly true of businesses that have a well-defined beginning and end to the customer relationship. A good example is a subscription-based relationship, which may be found in a wide range of industries including insurance, communication, cable television, newspaper and magazine publishing, banking, and newly competitive utility markets.

The basis of survival data mining is hazard probability: that is, the chance that someone who has survived for a certain length of time (called customer 'tenure') is going to stop, cancel, or expire before the next unit of time. This definition assumes that time is discrete, and such discrete time intervals - days, weeks, or months - fit business needs. By contrast, traditional survival analysis in statistics usually assumes that time is continuous.
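The hazard definition above is easy to try on your own data. Here is a rough Python sketch of the discrete-time calculation; the function names, the (tenure, stopped) record layout and the five sample customers are mine, not the article's, and the handling of still-active customers is one common convention rather than anything the article prescribes.

```python
def hazard_by_tenure(customers):
    """Estimate the discrete-time hazard for each tenure.

    customers: list of (tenure, stopped) pairs, tenure in whole months.
    stopped=True  -> the customer cancelled in the month after reaching tenure.
    stopped=False -> still active when last observed (censored).
    """
    hazards = {}
    longest = max(tenure for tenure, _ in customers)
    for t in range(longest + 1):
        events = sum(1 for tenure, stopped in customers if tenure == t and stopped)
        # At risk: everyone seen to survive past t, plus those who stopped at t.
        at_risk = events + sum(1 for tenure, _ in customers if tenure > t)
        if at_risk:
            hazards[t] = events / at_risk
    return hazards


def survival_curve(hazards):
    """Chance of still being a customer after each tenure."""
    curve, alive = {}, 1.0
    for t in sorted(hazards):
        alive *= 1.0 - hazards[t]  # running product of (1 - hazard)
        curve[t] = alive
    return curve


# Hypothetical sample: three customers stopped, two are still active.
SAMPLE = [(1, True), (2, True), (2, False), (3, True), (3, False)]
```

Multiplying (1 - hazard) across tenures gives the survival curve, which is the usual way to compare customer segments against each other.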

Slashdot | CA Dangles $1M Bounty for Ingres Conversion Tools

Via Slashdot... CA has released Ingres for download to the masses with an announcement at LinuxWorld. Also, CA is launching a one million dollar contest to reward creators of database converter programs (to Ingres).

Wednesday, August 04, 2004

Caveat Migrator: BI Changes Abound in SQL Server 2005

Enterprise Systems | Caveat Migrator: BI Changes Abound in SQL Server 2005 has a bit of a different take on the BI features in 2005, i.e., get ready for a learning curve.

All indications are that Microsoft takes seriously the need to educate customers about SQL Server 2005. The company recently announced a new SQL Server 2005 “Ascend Program” that offers would-be adopters extra training and hands-on lab support to help with their migration efforts.

Database market shares in EU, ME, and Africa

Gartner just put out a report with this title. A brief summary:

Sales of this software in Europe, the Middle East and Africa recovered in 2003, rising by 11 percent. IBM captured 43 percent of revenue. Oracle was second with 26 percent, and Microsoft third with 18 percent.

Tuesday, August 03, 2004

A Hyperdrive could be in your database's future

Everyone knows disk access times are one of the most important measures of database performance, and since disk is often the bottleneck on a system, improvements can have a dramatic effect. Expensive SCSI and SAN systems all seek to improve performance, sometimes at significant cost for not much of an increase in benefit. It's a best practice (especially with RAM prices nowadays) to max out RAM on a database system (especially with Oracle, which has features to manage memory efficiently), but how about replacing the disk entirely with RAM? Not creating a RAMDisk in system memory, but using it separately? This is what the Hyperdrive claims to deliver.

From the description:

Reads and writes 80x faster than IDE hard drives.
"Instant on" after BIOS POST.
Win XP installs in 4 minutes.
Up to 16GB of space on DIMMs.
CDROM Form factor - your BIOS (and OS) see it as another hard drive (albeit a small one).
Independent power supply to retain data on shutdown.
Integrated 160min battery backup in case main board fails.
IDE socket on the board to auto backup the Hyperdrive to the hard disk on power failure.

Is it a good alternative for databases? Seems like a possibility to me, and the price ($700) is not outrageous.

I came across this link from some gaming friends who passed around reports that the new DOOM 3 apparently runs instantly using the Hyperdrive. That is extremely high praise. For someone out there, it's worth a shot in the database availability world.

Monday, August 02, 2004

Hacking on the Web Services platform

Michael Platt posted Google, eBay, Amazon and Web Services which links to an Information Week article about the value eBay and Amazon are getting from their web services initiatives, which open up their API (and thus a large portion of their databases) as a publicly available interface for anyone inclined to use them. The article mentions that eBay reports that use has exceeded expectations, and the web services are now queried more than 30 million times/day...

With the success of these, and of Google's web service API, the possibilities of applications get stretched in directions never possible before. With these sources, the possibilities look more eccentric than useful (such as Googling for something and having eBay results shown that are tied to the search results) but as more companies open up their information stores, massive opportunities will arise. As DW, analytics, and integration experts, many in this audience could spot opportunities based on what is involved in this field of work now.

Because when web services of this type become ubiquitous, the classic opportunities of combining similar data from different sources arise... arbitrage, hedging, spreads, all of those things. Somewhere along the line fortunes will be made by spotting that things selling for $x in one service can be sold for $2x in another, and by automating the whole mess to identify and execute the opportunities when they happen. It's done today in the stock market and in the international money market; maybe soon it will be done with T-shirts and baseball cards. Or with business services even. It's tough to see through the fog and predict just how all these services can be strung together to create a huge opportunity, but if you were ever looking for one, there it is.
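The spread-spotting part is simple enough to sketch in a few lines of Python. Everything here is hypothetical: the two price lists stand in for whatever two web services would return, the item names and prices are made up, and no real eBay, Amazon or Google API calls are involved.

```python
def find_spreads(service_a, service_b, min_ratio=2.0):
    """Find items priced at least min_ratio times higher in one service.

    service_a, service_b: dicts mapping item name -> price.
    Returns a list of (item, cheaper_service, low_price, high_price).
    """
    opportunities = []
    # Only items listed in both services can be compared.
    for item in sorted(set(service_a) & set(service_b)):
        a, b = service_a[item], service_b[item]
        low, high = min(a, b), max(a, b)
        if low > 0 and high / low >= min_ratio:
            cheaper = "A" if a < b else "B"
            opportunities.append((item, cheaper, low, high))
    return opportunities


# Made-up sample data for two services.
PRICES_A = {"t-shirt": 10.0, "baseball card": 4.0, "mug": 5.0}
PRICES_B = {"t-shirt": 25.0, "baseball card": 5.0, "poster": 3.0}
SPREADS = find_spreads(PRICES_A, PRICES_B)
```

The hard part is everything this leaves out: matching the same item across services, fees, shipping, and executing before the price moves.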

Sunday, August 01, 2004

OSCON 2004: Open Source GIS with GRASS

Phil Windley is at OSCON 2004 and has a short review of the session Open Source GIS with GRASS. There is a link to the slides of the talk, which do a good job of explaining what GRASS is all about.

Looks like a pretty steep learning curve to me... everything is command line driven (which is great for automation, but very annoying when authoring a solution to a real world problem). There are also several examples of working with site/point data and finding routes. The tutorial authors know their stuff, and are proposing a book on using mapping software.

GIS systems can be very expensive, so this is an alternative if you absolutely have to have a free solution, and have a lot of time to spend learning a system with thin documentation and becoming, quite possibly, the sole expert on the application.

A better solution is MapPoint, which is by all accounts an underused jewel in the rough. Easy to use, can get data directly from an RDBMS or OLAP source (and text files or Excel spreadsheets), has pretty zoomable maps in a GUI, and can be purchased for well under $250.