Ted Kummert at PASS Summit: Hadoop-based Services for Windows Azure CTP to Release by End of 2011

Posted On Wednesday, October 12, 2011 By rss. Under OakLeaf Systems    

Ted Kummert announced in his PASS Summit 2011 keynote on 10/12/2011 a partnership with Hortonworks to bring an Apache Hadoop-based distribution to Windows Server and Windows Azure, with a first preview of the Windows Azure service due by the end of 2011. From the press release of the same date, Microsoft Expands Data Platform With SQL Server 2012, New Investments for Managing Any Data, Any Size, Anywhere:

Microsoft is committed to helping customers manage any data, any size, anywhere with the SQL Server data platform, Windows Server and Windows Azure. Hortonworks has a rich history in leading the design and development of Apache Hadoop. Their experience and expertise in this space helps us accelerate our delivery of our Hadoop based distribution on Windows Server and Windows Azure while maintaining compatibility and interoperability with the broader ecosystem.

Ted posted Microsoft Expands Data Platform to Help Customers Manage the ‘New Currency of the Cloud’ at 9:00 AM:

This morning, I gave a keynote at the PASS Summit 2011 here in Seattle, a gathering of about 4,000 IT professionals and developers worldwide. I talked about Microsoft’s roadmap for helping customers manage and analyze any data, of any size, anywhere: on premises, and in the private or public cloud.

Microsoft makes this possible through SQL Server 2012 and through new investments to help customers manage ‘big data’, including an Apache Hadoop-based distribution for Windows Server and Windows Azure and a strategic partnership with Hortonworks. Our announcements today highlight how we enable our customers to take advantage of the cloud to better manage the ‘currency’ of their data.

We often talk about the economics of the cloud, detailing how customers can achieve unmatched economies of scale by taking advantage of public or private cloud architectures. As an example, an enterprise with a small incubation project could theoretically take it to production overnight, thanks to the elasticity and scalability benefits of the cloud.

As we turn more and more to the cloud, data becomes its currency. The exchange of data is the heart of all cloud transactions, and, as in a real-world economy, more value is created whenever data is generated or consumed. But there are new business challenges that this currency creates: How do we deal with the scope and scale of the data we manage? How do we deal with the diversity of types and sources of data? How do we most efficiently process and gain insight from datasets ranging from megabytes to petabytes?

How do we bring the world’s data to bear on the tasks of the enterprise, as businesses ask themselves questions like: “What can data from social media sites tell me about the sentiment of my brands and products?” And, how do we enable all end-users to gain the critical business insights they need – no matter where they are and what device they are using? Customers need a data platform that fully embraces the cloud, the diversity and scale of data both inside and outside of their ‘firewall’ and gives all end-users a way to translate data into insights – wherever they are.

Microsoft has a rich, decades-long legacy in helping customers get more value from their data. Beginning with OLAP Services in SQL Server 7, and extending to SQL Server 2012 features that span beyond relational data, we have a solid foundation for customers to take advantage of today. The new addition of an Apache Hadoop-based distribution for Windows Azure and Windows Server is the next building block, seamlessly connecting all data sizes and types. Coupled with our new investments in mobile business intelligence, and the expansion of our data ecosystem, we are advancing data management in a whole new way. …

Read more.

Ted introduced Hortonworks’ Eric Baldeschwieler, who reported “Yahoo now has 40,000 computers running Apache Hadoop”, “Over 80 percent of new data being generated is from unstructured sources” and “Hadoop could be storing half the world’s data within five years.”

Kummert said a Community Technology Preview (CTP) of the Hadoop-based service for Windows Azure will be available by the end of 2011, and a CTP of the Hadoop-based service for Windows Server will follow in 2012.

Denny Lee demonstrated a HiveQL query against log data in a Hadoop for Windows cluster, using a HiveODBC driver that Ted Kummert said will be available as a CTP next month (November 2011).

Denny’s post of 10/12/2011, Revelations – rolling the hard six to SQL BI and Hadoop, provides more information on Apache Hadoop on Windows Azure and Windows Server:

Okay! With today’s Ted Kummert’s Day 1 Keynote of the SQL Server PASS Summit 2011, I had the honor of demonstrating how SQL BI and Hadoop rock together! As you can see from the Port 25 Microsoft, Hadoop, and Big Data and the Microsoft News Center for SQL Server 2012 posts there are a number of cool things that are happening:

  • It started with the Hadoop connectors for SQL Server and PDW. Key call out here is that these connectors are bi-directional to allow data movement back and forth between SQL Server and Hadoop.
  • Windows Server and Windows Azure optimized Hadoop distributions; out of the box (or cloud), the distributions include support for HDFS, Hive, Pig-Latin, FTP, etc.
  • Our partnership with Hortonworks to help us push forward faster with optimizing Hadoop to run on Windows as noted in their post Bringing Apache Hadoop to Windows.
  • As part of the demo today, I showed the integration of the SQL BI stack with Hadoop by having PowerPivot (for Excel and SharePoint) interact with Hadoop for Windows cluster via Hive and the soon to be released HiveODBC driver.
  • Not shown today, but just as cool, will be the release of the Excel Hive Add-in.

More information will be posted at www.microsoft.com/bigdata as it becomes available, eh?!

Cool, so why did I use “embrace Hadoop”?

A key call out during my conversation with Ted during the keynote is that our offering is 100% compatible with Apache Hadoop – if your code works on Apache Hadoop then it will work on ours and vice versa. But, it’s not just about the code, it’s also about this shift that we are embracing the open source community!

For example, one of the key demos that I have shown is the ability to write Map Reduce jobs in JavaScript (as opposed to Java). This is what I would like to call:

Our VB moment in Big Data

That is, we had made Visual Basic a powerful language for developers and with .NET opened the door for these developers to go into the enterprise. By making JavaScript a first class language for Big Data, we are helping to enable the millions of JavaScript developers to enter the realm of Big Data. Even more awesome is the JavaScript on Hadoop, an example of one of our proposals back to the Apache Hadoop community.
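
To make the MapReduce model behind that demo concrete, here is a minimal sketch in Python rather than JavaScript. It is not the JavaScript API shown in the keynote, which this post does not describe. The function names (map_phase, shuffle, reduce_phase, run_job) and the word-count task are assumptions chosen for illustration, and everything runs in a single process instead of on a Hadoop cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key.

    On a real cluster the framework does this between map and reduce;
    here it is an in-memory dictionary (an illustrative stand-in).
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling run_job on a list of text lines returns a dictionary of word counts. A developer writes only the map and reduce functions; the framework handles grouping and distribution, which is why the choice of language for those two functions matters so much for who can take part.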

So why is Big Data / Hadoop important for a BI dude or dudette?

I’ll probably have a number of posts for this question alone, but let me give you one answer right now – this is an excerpt from my post: “Hadoop: A movement, not just a technology”

Why am I excited about Hadoop and Big Data even though I’m a Microsoft BI person for most of my career? Because first and foremost, BI is all about making sense of the information. And the greatness of Big Data isn’t just about exploring, understanding, and asking even more questions of this information, but doing it in distribution (vs. silos) and putting more emphasis on the data (i.e. this is where the real IP is)

Any other cool information on Big Data at SQLPASS this week?

Both Ted Kummert and David DeWitt’s keynotes will cover Big Data. If you cannot attend, check out the SQL Server PASS Summit 2011 Live Streaming. As well, there are two breakout sessions on Big Data, both on Thursday:

  • AD-216-M: Overview of Big Data on Windows and Windows Azure by Saptak Sen
  • BIA-408-A: SQLCAT: Tier-1 BI in the world of Big Data by Thomas Kejser and myself – with special guest Kenneth Lieu from Yahoo!

Also don’t forget that I will be hosting the Big Data table at the Birds of Feather luncheon and a bunch of us will be floating around the Big Data Kiosk in the product pavilion.

Whew! I think that’s it for today!

For more details about Big Data in the Cloud, see my Choosing a cloud data store for big data (June 2011) and Microsoft’s, Google’s big data [analytics] plans give IT an edge and links to Resources (August 2011) for SearchCloudComputing.com.


Technorati Tags: Windows Azure,Apache Hadoop,Hadoop for Windows Azure,Hadoop for Windows,Hive,HiveQL,HiveODBC,Hortonworks,Yahoo,Cloud Computing Futures,Big Data,Big Data Analytics

http://oakleafblog.blogspot.com/2011/10/ted-kummert-at-pass-summit-ctp-for.html