A nice list of Open Source databases
Decisions, Decisions... lists a number of them with links, including VistaDB, Cloudscape, and the usual suspects.
Stuff that interests me that I have some trouble finding info on... initially all about Instant Messaging (esp bots) and BI/OLAP.
Optimize Magazine > Technology Innovation > BAM Keeps A Finger On The Pulse > January 2005 - a nice article about what BAM is and how its principles will work their way into everything in a large company.
From Mosha Pasumansky: Analysis Services 2000 vs. 2005: "I have run across a presentation called 'Analysis Services 2000 vs. 2005' prepared by Jaimie Basilico and Mark Frawley (Jamie works in Microsoft as Senior Technology Specialist in the East Cost, and he is one of the best field people in Analysis Services that we have). This presentation is targeted towards people who are familiar with Analysis Services 2000 and want to come on speed with Analysis Services 2005. I have found the presentation very useful, but not all subjects are covered in the same depth. Below are my comments"
Kent Tegels points out The Bloggers Guide to BizTalk. It comes in the form of a downloadable Windows Help file, and all I can say is wow! Very nicely done, and very organized. A very good introduction if you aren't familiar with this product and want a quick crash course, with the ability to get into the areas of detail you want.
SQL Server 2005 - Interface Overview from Database Journal. Don't have access to the SQL Server 2005 beta, but want to see what all of the new screens look like? Steven Warren posts a bunch, everything from the new start menu to all the admin screens that come with it.
Enterprises Warming Up to Firebird Open-Source Database is an article on EWeek that summarizes a recent survey by Evans Data Corp. Among the findings:
My last post on SOA topics today...
It's Official: no XQuery for Whidbey, and that's still fine. from Kent Tegels breaks the news. Whidbey is the beta version of the next version of Visual Studio .NET. XQuery was way too complicated, making simple SQL statements a mess. The idea of using XML for communication is, of course, not a bad idea, but the best way, imo, is to just adopt the ODBC model. All you need is four elements - hostname, login, password, and SQL statement. That's it. Return a result set and an error code. That's it.
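To make the four-element idea concrete, here is a minimal sketch in Python of what such an XML request and response could look like. The element names (request, response, resultset, row, col, error) and the function names are my own assumptions for illustration, not part of any real product or standard, and a real service would not send a password in clear text.

```python
import xml.etree.ElementTree as ET

def build_request(hostname, login, password, sql):
    """Build the four-element request as an XML string (element names are hypothetical)."""
    root = ET.Element("request")
    for tag, value in (("hostname", hostname), ("login", login),
                       ("password", password), ("sql", sql)):
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")

def build_response(rows, error_code=0):
    """Wrap a result set (a list of dicts) and an error code in XML."""
    root = ET.Element("response")
    ET.SubElement(root, "error").text = str(error_code)
    resultset = ET.SubElement(root, "resultset")
    for row in rows:
        row_el = ET.SubElement(resultset, "row")
        for column, value in row.items():
            ET.SubElement(row_el, "col", name=column).text = str(value)
    return ET.tostring(root, encoding="unicode")

def parse_response(xml_text):
    """Return (error_code, rows) from a response document; values come back as strings."""
    root = ET.fromstring(xml_text)
    error_code = int(root.findtext("error"))
    rows = [{col.get("name"): col.text for col in row.findall("col")}
            for row in root.find("resultset").findall("row")]
    return error_code, rows
```

A caller would send the output of build_request over whatever transport it likes and hand the reply to parse_response. Everything comes back as text, so type handling is left to the client in this sketch.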
Slashdot | LAMP Grid Application Server, No More J2EE points to ActiveGrid, a LAMP grid implementation that sounds very cool indeed.
Wow, what an informative article. Database Benchmarking on Database Journal is written by Steve Callas, an Oracle DBA, and covers all of the major benchmarking tests performed by the Transaction Processing Performance Council, or TPC, and explains why you have to take all of the results with a grain of salt. Great background for the alpha geek that explains the nuts and bolts behind the popular benchmarks.
The Florida Education Data Warehouse GIF from the B-EYE Network. It's a nice write up of a failed implementation of a data warehouse, which was saved by an implementation of Bill Inmon's Government Information Factory, with a nice color coded picture.
Just got this in my Sourceforge update...
Only a few days left to test your Java skills.. We are giving out ten
40GB iPods and 50 (count 'em) SourceForge.net T-shirts. Entries have
to be in by the end of January 2005. Cloudscape, by IBM, is a
powerful Java-based SQL database that sports a small footprint (a few
megabytes). Since it's Java based, it's cross-platform. IBM recently
open sourced the database under the name Derby. The contest is easy --
simply download the database and our special data files and answer the
question we email you (it's a simple SQL exercise). If you are able to
get the correct answer from the datafile you'll be placed in a drawing.
What are the odds? Likely pretty good. It depends on how many people
enter the contest and have the right answer. Only a few days left.
Enter now.
Note: You have to be a U.S. or Canadian (except Quebec) resident to
enter (sorry, blame our legal folks). To enter go here:
http://sourceforge.net/cloudscape_contest.php
InfoWorld: New worm targets MySQL installations. It was bound to happen sooner or later:
The new version of Forbot infects machines by taking advantage of administrator accounts with weak or nonexistent passwords. The worm cracks the accounts by trying values from a predefined list of around 1,000 possible passwords, Ullrich said.
........
To be infected, MySQL has to be configured to allow the root account to log in remotely to the system. By default, the root account is only allowed to log on at the machine running MySQL, rather than remotely. The root account also has to use a password that is on Forbot's list of passwords, Ullrich said.
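The two conditions in that excerpt (root reachable remotely, and a password on the worm's list) are easy to audit for. Below is a rough sketch in Python, assuming you have already pulled (user, host, password) rows out of your own account inventory. The function name, the row layout, and the plain-text passwords are assumptions for illustration only; MySQL itself stores password hashes, so a real check would have to compare hashes or test logins instead.

```python
# Host values that only permit connections from the server itself.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}

def at_risk_accounts(accounts, weak_passwords):
    """Return (user, host) pairs for root accounts that accept remote
    logins and use an empty password or one from the weak list."""
    weak = set(weak_passwords) | {""}
    risky = []
    for user, host, password in accounts:
        remote = host not in LOCAL_HOSTS  # e.g. '%' or a specific remote host
        if user == "root" and remote and password in weak:
            risky.append((user, host))
    return risky
```

Anything this returns meets both conditions from the excerpt; a root account limited to localhost is skipped, matching the default configuration described above.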
Christa Carpentiere points to Partitioned Tables and Indexes in SQL Server 2005, a needed and welcome feature that could be part of SQL Server 2005. SQL Server has never really had such a feature, an important one, and a concrete reason why Oracle is often a better option for very large databases. Of course, you could fake it with filegroups and views, but that's just what it is - faking it.
About this paper The features and plans described in this document are the current direction for the next version of the SQL Server. They are not specifications for this product and are subject to change. There are no guarantees, implied or otherwise, that these features will be included in the final product release.
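For anyone who hasn't worked with partitioning, the core idea is just routing each row to a bucket based on where its key falls among a sorted list of boundary values. Here is a minimal sketch of that idea in Python. It is a generic illustration, not SQL Server's syntax or implementation, and the class name and the choice to put a boundary value in the partition to its right are my own assumptions.

```python
from bisect import bisect_right

class RangePartitioner:
    """Map a key to a partition number using sorted boundary values.

    N boundaries produce N + 1 partitions. A key equal to a boundary
    lands in the partition to the right of that boundary.
    """

    def __init__(self, boundaries):
        self.boundaries = sorted(boundaries)

    def partition_for(self, key):
        # bisect_right counts how many boundaries are <= key.
        return bisect_right(self.boundaries, key)

    def route(self, rows, key_of):
        """Group rows into a dict of partition number -> list of rows."""
        partitions = {}
        for row in rows:
            partitions.setdefault(self.partition_for(key_of(row)), []).append(row)
        return partitions
```

With boundaries of 2003, 2004, and 2005 on an order-year column, everything before 2003 goes to partition 0 and everything from 2005 on goes to partition 3. Built-in partitioning means the engine does this routing for you.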
That means the open source route: Grid Computing Takes the Linux Route:
Monday's launch of the Globus Consortium by HP, Intel, IBM and Sun Microsystems represented the second body devoted to the commercialization of grid to come into being in the past year, after the Enterprise Grid Alliance launched in April.
Why do we need yet another grid outfit? Besides the EGA, we already have the Globus Alliance, as well as a smattering of bodies that work on grid standards, including the Global Grid Forum, OASIS (Organization for the Advancement of Structured Information Standards) and the World Wide Web Consortium.
The Globus Consortium, however, is specifically devoted to advancing open-source implementation of grid standards as the world of grid opens up to commercial use. The group is focused on advancing the Globus Toolkit, an open standards building block for enterprise-level grid implementations that came out of the Globus Alliance, an open-source-focused organization at Argonne National Labs.
What a great article, as I was just wondering how to go about doing this. Connect to Lotus Domino using SQL Server Linked Server is worth a bookmark, as Lotus Notes is very common in enterprises, but gaining access to its data can be quite a challenge, and usually involves third-party tools of some kind.
Technical Comparison MySQL 4.1 vs. Microsoft SQL Server 2000 - from Only4Gurus.com is a PDF report prepared by A23 consulting. It is pretty unbalanced (claiming for example that MSDE is a perfectly appropriate free version of SQL Server, apparently interchangeable with SQL Server for all intents and purposes, thus canceling out any "free" advantage MySQL may have) but an interesting read nonetheless. It is very, uh, gratuitous.
Lucas Jellema notes that Oracle Business Intelligence Spreadsheet Add-In Available for Download:
With this add-in, it becomes much easier to make use of the OLAP functionality in the Oracle 9iR2 and 10g database. Wizard-driven from Excel, end-users with only a little training should be able to perform analysis that otherwise would require the use of custom built applications or the configuration of Analysis tools such as Discoverer.
The good thing about trying a new project is, much like writing an article or teaching a class, it forces you to research a lot of things you thought you knew well.
The Need for Better SQL Server Backups covers something that is normally a yawner of a topic, and it would seem to be far too entry-level for DMReview, but this article has some good points a lot of similar articles don't cover, including using a third-party tool to speed up and compress backups (saving a lot of space - my favorite is SQL Litespeed), and a good argument for other third-party tools for backup management, as Enterprise Manager doesn't cut it for a large number of servers.
Christa Carpentiere has the link to SQL 2005 Webcasts - Q&A. Lots of links to webcasts there, plus Q&A transcripts, covering everything from full text search to reporting services to notification services. Lots of useless info in the Q&A but a few interesting things, such as the possibility of a query analyzer for MDX in the next build of SQL Server 2005.
Just popped up on my Feedster feed. Never let it be said Google is solely a MySQL shop :) Google: Oracle DBA (Business Intelligence Apps)
Google is looking for an outstanding Oracle DBA who will play a major role in the development of business intelligence applications. This is a unique opportunity, in that we are in the early stages of forming a team. The person hired into this role will wear many hats: database architect, capacity planner, backup and recovery architecture designer, performance tuner, and of course database administrator responsible for production and operational support.
Marco Russo points to Business Intelligence Portal Sample Application for Microsoft Office, which is a new release. This actually looks very cool (screenshots are in the docs).

I attended an Oracle Tech Day yesterday in my city, and one of the sessions I went to was the Oracle 10g Grid presentation. I already knew much of the information, and I thought about a few things I've posted on recently, mainly about the differences between white-box clustering (used by Google, Yahoo, Microsoft, and others) and what I'll call traditional, shared-space, SAN-dependent, enormously expensive solutions. Both require a level of technical knowledge just way too high for most. Vendor solutions are very expensive, but the white-box, commodity approach has to be customized to the app in question, as it usually involves multiple partitions over many boxes.
This has been noted at a few mainstream outlets already... PostgreSQL 8.0 is the latest version of this open-source database, largely considered a competitor to MySQL in the open source world. Its claim to fame in the past was that it had a lot of advanced features, including triggers and stored procedures, unlike MySQL.
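As a rough idea of what "multiple partitions over many boxes" means in practice, here is a minimal sketch in Python of the simplest possible scheme: hash the key and take it modulo the number of boxes. The function name and the box names are hypothetical, and this is not a description of how Google, Yahoo, or Microsoft actually do it.

```python
import zlib

def box_for(key, boxes):
    """Pick a box for a key using a stable hash (CRC32) modulo the box count."""
    if not boxes:
        raise ValueError("need at least one box")
    # zlib.crc32 gives the same value on every run and every machine,
    # unlike Python's built-in hash() for strings.
    index = zlib.crc32(key.encode("utf-8")) % len(boxes)
    return boxes[index]
```

The catch, and part of why this has to be customized to the app, is that adding or removing a box changes the modulus and remaps most keys, so the application needs its own plan for moving data around.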
Although tested throughout our release cycle, the Windows port does not have the benefit of years of use in production environments that PostgreSQL has on Unix platforms and therefore should be treated with the same level of caution as you would a new product.
For anyone reading this feed, via a browser or (more likely) on RSS, you may wonder what I look like. You'd know I like databases, and business intelligence, own my own business that provides those services, and have done some work for some first class clients, plus I'm an open-source advocate, IM bot creator, not to mention I have a technical blog, so you may have developed a mental picture of me. So, compare your mental image with a pic:

Here's an online guide to OWB 10g. It's hands-on, allows you to install all the software on your machine, and takes about 4 hours. Can't turn down some free training.
In any large company, technical people often build their own solutions to get problems solved quickly. A very, very common case is an engineer or CPA in production or finance who learns just enough Access to whip up a nice little automated solution for something (because they couldn't get anyone in IT to listen to them). But very quickly, the little Access program becomes indispensable to daily business. More things get added on, and before you know it, it stores over a gig of data and is so slow it takes 10 minutes to run a query.
Just noticed this, although it is over a year old... Microsoft SQL Server: SQL Server 2000 for the IBM DB2 Customer Kit.
Slashdot points to an interview with the MySQL CEO: Open source & MySQL will rise, legal foes will fall:
"What challenges do you see facing businesses that are going to start using more open source software in 2005?
Mickos: We deal a lot with enterprise customers, and we ask them what problems they foresee and what questions remain unanswered. Their No. 1 concern is training the staff. They are asking themselves whether they need to retrain people or whether they have the skills in-house already.
The good news is that most corporations discover, when they ask around, that they have open source skills in-house. That is an important milestone for the open source movement. Many corporate IT people have used open source products at home or, sometimes secretly, in business projects.
Of course, formal training may still be needed. That is the big hurdle that large organizations need to jump as they adopt more open source."
Mark Rittman points out New Oracle Business Intelligence 10g Training Resources:
The lessons use the Oracle Business Intelligence samples that you can additionally download from OTN, and take the form of simple steps and screenshots to walk you through creating workbooks, analyzing data, building reports and so on. I've worked through them myself as a starting point for putting some demos together, and they're a pretty comprehensive look at the new features.
High Availability and Scalability Enhancements covers some of the features SQL Server 2005 will support, including real, no-faking table partitioning, and support for adding and detecting physical memory on the fly without shutting down.
Business 2.0 :: Magazine Article :: In Front :: Mark Cuban's End Game:
"Even more heretical is Cuban's opinion of DVDs, which is that they suck -- or, at least, that they're inferior to hard drives as a medium for storing digital content. 'Why would we invest in DVD,' he asked, 'knowing that hard drives are going to grow in capacity, shrink in size and price, and can also be erased and rewritten?' He imagines selling HD movies stored on key-chain drives -- or putting multiple films on larger drives, 'like software used to be packaged on PCs.' Moreover, he added, 'with ever-expanding storage, we can increase picture quality for years to come by taking advantage of new cameras and better compression schemes. With DVDs, we can't.' "
The New York Times > Technology > Young Cell Users Rack Up Debt, a Message at a Time talks about one of my favorite subjects, instant messaging.
Text-messaging has flourished for years in Europe and Asia, where it is immensely popular among young people. In the United States, activity was limited until 2002, when a breakthrough in the wireless market allowed short text messages to be sent among customers of the major cellular carriers. Previously, customers could send messages only to those who used the same carrier.
The service, known as S.M.S. (for Short Message Service), has since taken off. According to a recent report from Forrester Research, a company in Cambridge, Mass., that specializes in technology, Americans sent 2.5 billion text messages a month in mid-2004, triple the number sent in mid-2002.

Simon Saban points to a signup page for the SQL Server 2000 SP4 Beta. SP3a has been the standard for about 2 years now and is generally considered pretty stable, but there are over 200 bug fixes in SP4. The only addition I spotted relates to new diagnostic settings for identifying some problems.
Sadagopan's weblog on Emerging Technologies,Thoughts, Ideas,Trends and Cyberworld points to this Wired article about indexing pictures and video.
A group of European researchers are developing technology that can recognize everyday objects in digital images. The image-processing software looks for "key patches" in an image to determine the relative positions of different shapes, such as tires and a car body, or a beach and ocean waves, to categorize the image's contents. The software has learned hundreds of objects since development began in 2002, and can be used to categorize images and automatically create image tags. The software can look for images similar to those it has already scanned and "knows." The software is currently being tested on a variety of images, and the researchers continue to add new object categories. Companies such as clothing stores or sporting goods companies would jump at the chance to have a Google image-search result in pictures displayed with their products.
IBM's Pervasive Media Management group is developing visualization software that can identify objects contained within one of the web's fastest-growing content categories - video streams. The software identifies groups of objects within a frame to form concepts that can be easily searched, such as an airplane with a cloud and sky backdrop that would be categorized as travel. Categorizing the content of video through human labor can take 10 times as long as the duration of the content, as per IBM. The software can be trained to recognize images by providing it with a group of similar images. IBM is working with broadcasters CNN and ABC to identify concepts that can be used to classify news footage.
BI Solutions, Products on Tap:
What, specifically, are IBM's deliverables for business intelligence?
We typically lead with our P-series hardware. ... And we will lead with our Fastkey storage. We have a software platform that we've been evolving called Data Warehouse Edition. And underneath the covers of Data Warehouse Edition is DB2; DB2 CubeViews, which is our metadata bridge; there's the Information Integrator, which allows you to get to heterogeneous sources of data; there's the Intelligent Miner capability, which allows you to do scoring and analytics; and there's Warehouse Manager, which is rudimentary ETL [extraction, transformation and loading]. If you don't have ETL today, we'll give you some ETL capabilities. And there's actually a capability called OfficeConnect, which allows you to use spreadsheets as the presentation layer of the Data Warehouse Edition.
.....
How would you say IBM's approach to BI differs from Microsoft [Corp.]'s and Oracle [Corp.]'s approaches to BI?
We don't believe in the one-size-fits-all strategy, which I think differentiates us from our competitors. I also think what differentiates us from our competitors is the fact that we will deliver all the capabilities as an integrated package. We deliver the server, we deliver the storage, we deliver the services. Our competitors can't do that. They're software companies. They must partner.
Nimzo Benoni runs a very technical Oracle weblog. A daily, weekly, and monthly checklist of tasks might be pretty basic for some, but it's surprising how many DBAs haven't thought it out. A good reference.
The best and most insightful analytics project in the world can fail without properly understanding the business perspective, and without support from senior decision makers. Intelligent Enterprise Magazine: Metrics Development: Taking It From the Top makes this point well and has some great advice:
IT managers who head up dashboard projects must get their business counterparts heavily involved in KPI development. The more senior the collaborative team, the better. If CEOs or COOs aren't actively involved in KPI development, at a minimum they should sign off on the finished KPI scorecard. The ideal process is to create a senior cross-functional team that touches all major departments; the team should meet regularly until a consensus scorecard emerges.
Several factors increase the likelihood of success. First, going into these meetings, all participants need a clear (and consistent) understanding of the organization's high-level strategies and goals for the short and long term. If these strategies and goals don't exist, end the meeting immediately and inform the executives nearest the top that they have to provide them. There is no substitute. That's why many people strongly believe that executive sponsors are the true key to performance management success.
Second, team success depends on the creation of an open, nonpolitical atmosphere where members can discuss candidly what should and shouldn't be measured. Your own experience has probably shown the difficulty of setting such an atmosphere. Imagine this scenario: The people in the room may ultimately be judged (and potentially compensated) on how they perform against the measures the team is selecting — which is why a third key success factor is frequently obtaining outside help. Running these sessions can be especially difficult for a person inside the company, with the possible exception of the CEO or COO — someone who rarely has the time to head up the team sessions.
INETA Pakistan points to BPEL (Business Process Execution Language), or "Build your Business Apps with BPEL." (requires registration) Written by the director of server technology at Oracle, it's a good overview of how to get started using BPEL when developing applications. It goes over some of the important classes of tools to get started, like XML tools, web service tools, etc. The background for the article is Oracle BPEL, but the principles are, in true cross-platform fashion, applicable to any environment, whether you use Oracle or Microsoft or Apache's open-source tools.
TerraServer Bricks, A High Availability Cluster Alternative - from Only4Gurus.com is a Microsoft Research paper that describes how Microsoft is experimenting with Google-type architectures to eliminate high-priced SANs and replace them with the commodity components used by the TerraServer project. The TerraServer project is known to many SQL Server customers, as it is often used as an example of the storage capabilities of SQL Server. The old configuration, a single SQL Server node in an active-passive cluster connected to 18 TB of data, was enormously expensive, complicated to administer, took 30+ seconds to fail over, and required a backup tape library (the SAN was too expensive to provide a duplicate copy).
In many enterprise applications, a SAN’s high cost and complexity can be tolerated because of the ROI the application provides to the organization. However, most internet applications have razor thin profit margins. It is difficult if not impossible to host a profitable internet business on SAN hardware. Yahoo and Google give good examples of this. They buy very low-cost hardware configured redundantly to achieve high availability. They do not depend on system software or hardware components to handle failure cases. Instead, they program “around failures” at the application or in the “middle-ware” that their staff implements. As a result, they have very high application availability implemented and deployed at a very low cost.
In contrast MSN and many Microsoft customers have traditionally deployed SQL Server, and Microsoft clustering applications that expect the underlying hardware and system software to handle failure conditions transparently to the application. This is changing, MSN search has a brick design, and MSN Hotmail is making the transition from expensive backend SAN servers to commodity servers similar in design to TerraServer Bricks.
Three-year TCO has gone from $3.3M to $0.5M – a 6-fold cost reduction.
The TerraServer Brick architecture, server equipment purchased from Silicon Mechanics, and the SATA disk technology has exceeded our expectations in every aspect.
We already knew that SATA disks and white-box PCs could meet the performance requirements because of testing done in October 2003 [Barclay03]. We were frightened into thinking the failure rate of the SATA disk drives would be 100%. The actual annual failure rate has been 6.4% which is reasonably close to the 5.5% SCSI disk failure rate. The SATA drives combined with the reliability and performance of the 3Ware RAID controllers are formidable competitors to SAN technology at a fraction of the cost.
We expected the “white-box” servers to be less reliable and the service to be worse than what we received from Compaq (now HP) on the SAN Cluster. We had a handful of reliability issues and excellent service from Silicon Mechanics that so far is on par with the experience we had with Compaq for over five years. We also experienced zero “blue screens” or other unexplained system crash. Actually, we didn’t experience any issues with the system software or hardware that resulted in a system crash.
...
In summary, we conclude, like Yahoo, Google, MSN Hotmail, and MSN Search, that commodity storage and servers are the price/performance choice for high-volume web applications. While we loved the TerraServer SAN Cluster and its ability to detect and handle failures transparently, the price, performance, and reliability benefits of the TerraServer Bricks configuration outweigh the costs of implementing failover and redundancy logic in the application. We expected to find limitations and missing features in Windows, .NET, and/or SQL Server that high-availability web sites would need to deploy Windows and SQL Server on commodity servers. We were wrong. Windows 2003, .NET 1.1 and SQL Server 2000 have all the engineering robustness and features required for users to deploy highly-available and high-volume web applications with little additional investment in application development.
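The phrase that sticks with me from the paper is programming "around failures" in the application instead of expecting the hardware to hide them. Here is a minimal sketch in Python of what that can look like: try each redundant copy in turn and only give up when all of them fail. The names (query_with_failover, AllReplicasFailed) and the idea of passing replicas in as plain callables are my own assumptions for illustration, not code from the TerraServer paper.

```python
class AllReplicasFailed(Exception):
    """Raised when every redundant copy has been tried and none answered."""

def query_with_failover(replicas, request):
    """Try each replica in order and return the first successful result.

    `replicas` is a list of callables, each taking the request and
    returning a result or raising an exception on failure.
    """
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(exc)
    # Every copy failed; surface all of the collected errors to the caller.
    raise AllReplicasFailed(errors)
```

The application pays for this with a bit of extra logic on every call, which is the cost the paper says was outweighed by the price and reliability of the commodity boxes.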