Thursday, April 29, 2004

Intelligent Enterprise Magazine: Scalability Challenges for Large Databases

Intelligent Enterprise Magazine: Contents Under Pressure discusses the growth of very large databases in businesses around the world, based on the "Top Ten Winners" study from Winter Corporation. Some telling statistics about SQL Server:

The most striking evidence of the emergence of Windows as a platform for large database processing, however, occurred among transaction-processing databases. Unix dominated Windows in OLTP in 2001, taking nearly 60 percent of databases that qualified our measurements (a quarter of the submissions ran on z/OS and OS/390). However, by 2003, Windows-based databases accounted for more than two out of five OLTP databases (43 percent), propelling Windows into a virtual tie with Unix as the most widely used transaction-processing platform, according to our program data. Moreover, the largest Unix OLTP system held 5.4 TB of data, barely more than the 5.3-TB Windows database reported by Verizon Communications. This implementation, which runs Microsoft SQL Server on HP ProLiant servers and Symmetrix disk arrays, signifies a twelvefold jump in the size of the largest Windows transaction-processing database that we assessed.

There is also a very interesting interview with Amazon about how they get business value out of their 15 TB data warehouse (running on Oracle).

BI Scorecard: Excel Integration is another good article, about integrating BI tools into an Excel front end in an effort to prevent Excel users from importing their own extracts from the data warehouse and unwittingly producing multiple versions of the truth.

Good articles there today - and I would not have been led there if Mark Rittman had not referred me to Coping With Data Warehouse Growing Pains.

Wednesday, April 28, 2004

SQL Server Performance Management

There's a nice PDF on the Embarcadero site on SQL Server Performance Management. It is written around the Embarcadero Space Analyst, but there is plenty of information for monitoring your own counters. It covers ratio-based analysis, how to spot bottlenecks and I/O problems, session resource consumption, etc.
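
As a rough illustration of ratio-based analysis (this is my own sketch, not something taken from the Embarcadero paper): SQL Server exposes ratio counters such as "Buffer cache hit ratio" as a pair of raw values, the counter itself and a matching "base" counter, and the percentage is the first divided by the second. The sample numbers and the 90 percent threshold below are made up for the example.

```python
def ratio_pct(value, base):
    """Turn a raw ratio counter and its base counter into a percentage.

    Returns None when the base is zero (no activity sampled yet).
    """
    if base == 0:
        return None
    return 100.0 * value / base


def flag_low_ratios(samples, threshold_pct=90.0):
    """Return {counter name: percentage} for ratios below the threshold.

    `samples` maps a counter name to a (value, base) tuple. The threshold
    is a hypothetical rule of thumb, not a vendor recommendation.
    """
    flagged = {}
    for name, (value, base) in samples.items():
        pct = ratio_pct(value, base)
        if pct is not None and pct < threshold_pct:
            flagged[name] = pct
    return flagged


if __name__ == "__main__":
    # Made-up readings, as they might be copied out of a counter snapshot.
    snapshot = {
        "Buffer cache hit ratio": (9870, 10000),
        "Cache Hit Ratio": (6400, 8000),
    }
    for name, pct in flag_low_ratios(snapshot).items():
        print(f"{name}: {pct:.1f}% is below threshold")
```

Ratios like this are only a starting point; a low number tells you where to look, not what is wrong.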

Corporate Performance Management Assessment

Gartner has a Corporate Performance Management Assessment that requires free registration.

Tuesday, April 27, 2004

Building A Laptop RAC Installation Using VMWare

Mark Rittman's Oracle Weblog: Building A Laptop RAC Installation Using VMWare is something I'll have to try out. Pretty cool! The promise of RAC is intriguing - enterprise-class high availability and performance on commodity white box/blade server architecture. Add capacity at will, and pull nodes out and replace them without interruption.

I looked into RAC once at my old job and decided against it. When I asked our Oracle reps for references we could visit within a couple of hundred miles, they only found two. One had two nodes, one had three, and both used nodes that had 8 CPUs and 8 GB of RAM each. Not exactly the commodity setup I was hoping for.

But the promise is still good enough to keep a close eye on. I asked a MS rep what he thought of it, and he seconded what I saw, that RAC right now doesn't fully deliver what it promises. He also told me in confidence that as soon as Oracle does get it right, SQL Server is in big trouble.

Thursday, April 22, 2004

XMLA overview

ThinOLAP Home has a few short articles on using XMLA, a vendor-independent way of using web services to query OLAP and data mining models instead of vendor-specific APIs. It is supported by Microsoft, Hyperion, SAS, and a few smaller vendors right now. Oracle is a holdout on adopting the standard, going for the JOLAP specification instead.

Here's a SAS presentation with a pretty easy to read overview of XMLA. And Mark Rittman has a great summary on XMLA vs JOLAP.
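
To give a feel for what "using web services to query OLAP" means in practice, here is a minimal sketch that builds the SOAP envelope for an XMLA Execute call carrying an MDX statement. It only constructs the XML; it does not post it anywhere. The element names follow the XMLA specification as I understand it, while the connection string, catalog name, and query in the example are placeholder assumptions to swap for your own server's values.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"

# Only affects the prefix used when serializing; purely cosmetic.
ET.register_namespace("soap", SOAP_NS)


def build_execute_request(mdx, catalog, data_source_info):
    """Return an XMLA Execute request (SOAP envelope) as a string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # The query itself travels inside Command/Statement.
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx

    # Connection details and result format go in Properties/PropertyList.
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}DataSourceInfo").text = data_source_info
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Format").text = "Multidimensional"

    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_execute_request(
        mdx="SELECT {[Measures].[Unit Sales]} ON COLUMNS FROM [Sales]",
        catalog="FoodMart 2000",
        data_source_info="Provider=MSOLAP;Data Source=local",
    ))
```

The point of the standard is that the same envelope should work against any provider that implements it, with only the properties changing.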

Tuesday, April 20, 2004

MySQL Notes

O'Reilly Network: Why MySQL grew so fast (news from the 2004 MySQL Users Conference) [Apr. 19, 2004] tries to answer the question that's on a lot of people's minds. It points out some interesting facts:


  • MySQL AB claims an installed base of five million systems, the largest of any database engine.
  • The mysql.com domain sees almost as much traffic as ibm.com.

MySQL was the first database I used and learned on (just like nearly everyone that ran a hobby site did in the 90s). The largest it got was 120 MB, a pretty decent size for that time. It was a very, very good database for that purpose back then, and has improved a great deal since. And after having worked for years now with SQL and Oracle, I can say this is definitely not something mainstream DBAs want to get caught with their pants down on, as it has definitely not stopped its march into the data center yet. At the least, don't let this quote describe you:

MySQL, first of all, illustrates in almost pure form the sequence of events Clayton M. Christensen documented as a "disruptive technology" in his ground-breaking book The Innovator's Dilemma. Early versions of MySQL lacked the basic features, such as ACID transactions and referential integrity, that experienced users expected from a relational database. In a pattern familiar to anyone who has read Christensen's book, knowledgeable observers dismissed MySQL as a toy.

It is definitely not a toy.

Data: The Forgotten Pillar of Basel II Compliance

Data: The Forgotten Pillar of Basel II Compliance in DMReview talks a little about the Basel II Capital Accord, which banks around the world are expected to comply with. The accord is fairly complicated, but essentially has 3 "pillars": minimum capital requirements, supervisory review, and market discipline. Pillar 1 sets capital requirements for credit risk, market risk, and operational risk, with the aim of measuring and reporting risk exposure in order to hopefully avoid large financial groups failing and disrupting global finances. The credit risk portion of Pillar 1 involves running analytics to determine probability of default, loss given default, exposure at default, and effective maturity for different asset classes and borrower grades.

From paragraph 407 (p. 74), "the bank must demonstrate that it has been using a rating system that was broadly in line with minimum requirements articulated in this document for at least the three years prior to qualification," and this is where the challenge is occurring for many banks, as they must find this historical data and make sense of it. These skills, combined with an understanding of the accord, are currently in very high demand.
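
For a sense of why that historical data matters, the credit risk inputs named above are commonly combined into an expected loss figure: probability of default times loss given default times exposure at default. The sketch below shows only that simple product; the accord's actual capital formulas are considerably more involved (and also use effective maturity), and the borrower grades and numbers here are invented for the example.

```python
def expected_loss(pd, lgd, ead):
    """Expected loss = probability of default x loss given default x exposure.

    pd and lgd are fractions between 0 and 1; ead is a currency amount.
    """
    if not (0.0 <= pd <= 1.0 and 0.0 <= lgd <= 1.0):
        raise ValueError("pd and lgd must be fractions between 0 and 1")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return pd * lgd * ead


def portfolio_expected_loss(exposures):
    """Sum expected loss over (pd, lgd, ead) tuples, one per borrower grade."""
    return sum(expected_loss(pd, lgd, ead) for pd, lgd, ead in exposures)


if __name__ == "__main__":
    # Hypothetical grades: (probability of default, loss given default, exposure)
    book = [
        (0.005, 0.45, 20_000_000),  # strong borrowers
        (0.020, 0.45, 5_000_000),   # middle grade
        (0.100, 0.60, 1_000_000),   # watch list
    ]
    print(f"Expected loss: {portfolio_expected_loss(book):,.0f}")
```

Every one of those inputs has to be estimated per grade from years of default and recovery history, which is exactly the data many banks are scrambling to find.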

SOA resources

Michael Platt, an MS Architect Evangelist from the UK, has posted a pretty complete list of resources and documentation on Microsoft's implementation of a Service Oriented Architecture.

And on that note, here's a link to some articles about the recent panel discussion at the Software Development 2004 conference in Santa Clara on the subject of "Software Trends: Marrying SQL, XML, Web Services and Grid Computing." Representatives from Oracle, Sun, IBM, Microsoft, BEA, and eWeek attended.

Sunday, April 18, 2004

Something completely different... Professional Video Game Players

In the late 90's I was part owner of a mildly popular group of websites that covered professional gaming worldwide, so I keep up on these things every now and then, as I feel I'm a bit of a witness to the growth of something I think will one day be a major worldwide sport. Things are moving very quickly in that direction now with the latest news that for the first time, a professional group in Europe recruited Starcraft players from Korea. And the years of living off of scraps may be over for the best players according to ESReality:


The largest transition in esports history has been upon us, the successful pro-gamer Nal_rA got a wooping $200,000 for switching to team KTF. Nal_rA has been doing very well recently, getting to the finals of the three recent tournaments he has participated in and winning two of those. Together with NaDa he is generally considered the best Starcraft player at the moment. KTF also recruited the prominent zerg player YellOw, giving him a salary of $85,000 per year. So KTF are looking very strong, with players such as the aforementioned Nal_rA and Yellow, as well as Chojja, Reach and Sync.

Shortly after Nal_rA and Yellow joined KTF, Boxer's team, 4U, announced a new sponsor. It is the largest telecom company in Korea, SK Telecom, who will be sponsoring the team. The contract with Boxer is said to be even better than the $200,000 Nal_rA got, but no numbers have been mentioned. Apart from that salary, the team will also get a 3550 sq office and a 12 seat van.


Interesting stuff. I'm user #11 on that site, seems so long ago.

Leveraging Predictive Analytics in Marketing Campaigns

Leveraging Predictive Analytics in Marketing Campaigns speaks in very general terms of how to get value out of historical customer data with predictive analytics, something that will be much more important (and accessible) in the future.

Saturday, April 17, 2004

Panorama - Business Intelligence Software

Panorama - Business Intelligence Software is getting some hype lately. It's a front end to Analysis Services that uses DHTML exclusively on the client side according to the documents. I want to take a free trial. Screenshots.

Panorama has an extremely close relationship with MS, as the original version of Analysis Services was based on technology acquired by MS from Panorama.

Thursday, April 15, 2004

Download details: Microsoft SQL Server Best Practices Analyzer Beta

Download details: Microsoft SQL Server Best Practices Analyzer Beta: "Overview
Microsoft SQL Server Best Practices Analyzer is a database management tool that lets you verify the implementation of common Best Practices. These Best Practices typically relate to the usage and administration aspects of SQL Server databases and ensure that your SQL Servers are managed and operated well."

It actually does a very good job, if you haven't run across it, and it's just a 4 MB download.

Wednesday, April 14, 2004

Mark Rittman's Oracle Weblog: Is The Future Of Data Warehousing "Distributed Intelligence"?

Mark Rittman's Oracle Weblog: Is The Future Of Data Warehousing "Distributed Intelligence"?: "Michael Carter has just put together an article, 'The Death Of Data Warehousing', that again questions the need for a data warehouse, but in this case proposes replacing it with decentralised, distributed 'Business Information Networks' that answer questions for particular departments or functions, and communicate using web services. According to the article:" ......

Very good write-up, and I agree with a lot of it. Good links to presentations also, which bring to mind all kinds of questions (if Oracle are experts on integration/CRM/etc, then why on earth couldn't my sales rep ever give me a list of all the licenses my company was paying support on? And why wasn't it available anywhere online? And if supporting open standards is so important, why is Oracle the only major IM vendor not part of the major group pushing IM integration?) but that is way off on a tangent. The write-up itself discusses a proposal to forget about the mammoth central DW idea and stick with Data Marts.

The author has a point, in that just about every DW suffers from flaws - they take too long to update, control is too restricted, and following the best practice of every Data Mart feeding from that warehouse leads to huge delays that line managers who are responsible for profit and loss don't want to wait for. For those managers that can affect profit given immediate access to information, I have to agree: yes, data from their own systems closest to them needs to be immediately available, and a slightly different version of the truth is an acceptable trade-off for speed of action. They can have a much more positive effect on revenue than the ivory tower can, and need that intelligence as close to the action as possible.

But the DW is still invaluable for overall reporting, and especially for data mining. The limitations of speed and availability are technological limitations, some of which are imposed by vendors (prohibitive, incremental license models for front ends, for example). As CPUs, disk access, and networks become faster that delay will grow shorter and shorter. It may take many years before it becomes timely enough for a sales manager to act on a trend though, but the technology will be there soon enough.

Mark Rittman's blog is probably the only blog where I read every single word that gets posted.

Introduction to MySQL Cluster (Jeremy Zawodny's blog)

Introduction to MySQL Cluster (Jeremy Zawodny's blog): "First we're seeing an overview of the NDB architecture. If you've never seen it before, think 'Oracle RAC without shared storage' and you're 95% of the way there.
The core NDB engine is a new storage engine inside MySQL. It provides transactions, replication, on-line backups, crash recovery, hash and tree indexes, on-line index builds, auto-detection of a failed node and re-sync when it comes back up. There are rolling upgrades, which provide a way to upgrade things without a disruption of service."

Jeremy has many more entries from the MySQL 2004 conference, hitting every session he attends. Interesting stuff. MySQL is without a doubt on the radar as a DB to watch for potential DW applications. $0/CPU + lightning fast selects = excellent reporting server...

Tuesday, April 13, 2004

Reporting Services OLAP Sample Reports

RS2K OLAP Sample Reports demonstrates using Reporting Services, the newest free add-on to SQL Server 2000, to produce reports from an Analysis Services cube. Here's also a draft white paper on the same subject.

Reporting Services is being reported everywhere as a viable replacement for Crystal Reports and other reporting suites; it's worth a look. "Free" does require some interpretation... Visual Studio .NET is required to author the reports (but not to view them).

Monday, April 12, 2004

Data Mining Worst Practices

This Way Failure Lies is a database-neutral article in DB2 Magazine on how NOT to approach a Data Mining project. Many good tips in this area, which is still a bit of black magic.

And by the way, any DW/BI people should look into the DB2 Business Intelligence Certification from IBM. It's 2 tests, one with generic DW concepts, and the entry level DB2 test. They are fairly easy for someone with experience, and a good base that will open doors in some places. The Study guide to the BI test (free) is a very good introduction to data warehousing, better than just about any non-free book I have seen.

Free clients for MS Analysis Services cubes

I mentioned it in the last article, so I'll list a few here that I have run across:

Some are very nice, some very basic, but you can't beat the price in a pinch, or when putting together a proof-of-concept in your company.

Sunday, April 11, 2004

Intelligent Enterprise Magazine: BI on a Budget

BI on a Budget talks to companies wondering why most BI solutions cost an arm and a leg for software licenses, and some alternatives. Of course, Analysis Services is at the top of the list, even though it doesn't mention that chances are, most mid-size and larger companies probably own several copies of it already. Some other comments:



  • Syncsort is mentioned as a low-budget tool... but the old, dated, but still relevant alternative of Perl is not. Neither is PL/SQL for Oracle.
  • No front ends are mentioned, curiously. Microsoft has several free or cheap ones (all that work with Analysis Services), including Data Analyzer, the thin web client, and pivot tables.
  • Web services (external) are mentioned as a solution to some external sources, including demographic data. The author doesn't realize an incredible amount of this data is available for free from the US Census. A talented developer can deliver every proximity calculation, market penetration, or any other demographic measure you can dream up with this data. (Tell them to look up "great circle.")
  • Also, Mappoint has a free OLAP plugin (for Analysis Services, of course) that makes magical eye candy.
  • There are many other very reasonably priced front ends, my favorite being Databeacon.
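
On the "great circle" tip above, here is a minimal sketch of the kind of proximity calculation I mean, using the haversine form of the great-circle distance. The function names, the tuple layout for customers, and the sample coordinates are my own assumptions for illustration; real Census data would need geocoding into latitude/longitude first.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def customers_within(branch, customers, radius_miles):
    """Return the customers located within radius_miles of a branch.

    `branch` is a (lat, lon) tuple; each customer is a (name, lat, lon)
    tuple. Both layouts are hypothetical.
    """
    return [c for c in customers
            if great_circle_miles(branch[0], branch[1], c[1], c[2]) <= radius_miles]


if __name__ == "__main__":
    branch = (33.52, -86.80)  # roughly Birmingham, AL
    customers = [
        ("near", 33.45, -86.75),
        ("far", 30.69, -88.04),  # roughly Mobile, AL
    ]
    print(customers_within(branch, customers, 25))
```

Join a count like this against Census population by tract or ZIP and you have a basic market penetration measure without paying for an external service.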

Intelligent Enterprise is one of my favorite magazines lately; it always has some relevant articles. And even better, they also have an RSS feed.

SQL Server Yukon Data Mining Tutorial

SqlJunkies: "SQL Server Yukon Data Mining Tutorial " has a white paper on SQL Server Yukon Data Mining. Of course, having the beta of Yukon helps a great deal, but there is still a great deal of good info, for example, a description of the BI Workbench:


Business Intelligence Workbench is a set of tools designed for creating business intelligence projects. Because the workbench was created as an IDE environment in which you can create a complete solution, you work disconnected from the server. You can change your data mining objects as much as you want, but the changes are not reflected on the server until after you deploy the project.
Working in an IDE is beneficial for the following reasons:
• You have powerful customization tools available to configure the workbench to suit your needs.
• You can integrate your Analysis Services project with a variety of other business intelligence projects encapsulating your entire solution into a single view.
• Full source control integration enables your entire team to collaborate in creating a complete business intelligence solution.
The Analysis Services project is the entry point for a business intelligence solution. An Analysis Services project encapsulates mining models and OLAP cubes, along with supplemental objects that make up the Analysis Services database. From Business Intelligence Workbench, you can create and edit Analysis Services objects within a project and deploy the project to the appropriate Analysis Services server or servers.


Many screenshots of the workbench and wizards. Microsoft is really going to upset an industry or two by including all this in the database product for free (again).

Saturday, April 10, 2004

Calculating Costs of a Data-Mining System

Calculating Costs of a Data-Mining System in Baseline magazine. The Excel spreadsheet is very good, with a lot of detail. Requires a free registration.

Social Computing Symposium Talks Online

Social Computing Symposium Talks Online for some info on a hot topic. Lots of well done short videos.


There's also a few other gems from past Multi-University/Research Laboratory (MURL) Seminar Series talks including:



And More

.NET Charting Capabilities w/sample code

Companion Content for Programming Microsoft ASP.NET has the code from the book. Chapter 22 has some sample code demonstrating a dynamic bar chart and pie chart based on pulling a query. The results, with some shadow and 3d effects added in are as good as any dashboard offering I have run across. Definitely, definitely worth experimenting with. Screenshot:




This is all with about 20-30 lines of ASP.NET code, with no extra libraries past the .NET framework on the web server.

The Visual Thesaurus, a Dictionary of the English Language

The Visual Thesaurus, a Dictionary of the English Language. Check it out, nice application of some data mining principles.

Monday, April 05, 2004

Mark Rittman's Oracle Weblog: New Developments In Oracle Data Warehousing

Mark Rittman's Oracle Weblog: New Developments In Oracle Data Warehousing has a perspective on real-time data warehousing. Very detailed and a good read with a link to several presentations on the subject.

Sunday, April 04, 2004

textually.org: Free Text-Timing For Marathon Runners

textually.org: Free Text-Timing For Marathon Runners tells about how friends can be alerted by SMS when a runner crosses certain check points. It's already standard practice to do this via the web, so this is a clever extension. Probably many many simple extensions like this. Maybe quarterly updates to football games?

Saturday, April 03, 2004

Craig Lamb & Singletary, Inc.

I have left my former position as VP of Database for Compass Bank to start my own company with two partners, Bill Craig and Ashley Singletary. The name of our company is Craig Lamb & Singletary, Inc., and we will provide robust BI and OLAP solutions using Six Sigma methodology to businesses nationwide, with a special focus on banks and banking processes.

Combined, we have a wealth of knowledge, including deep experience in BI implementations, SQL Server and Oracle backends, data movement, and Black Belt certified Six Sigma skills.

Our first major push for business will be at the Appro Systems user conference, where we will demo the trending and reporting solutions we have developed for Appro products, including MP100 and Loan Center.

We are looking forward to great things!