Archive for February, 2011

Hadoop Aims for the Enterprise

Hadoop, the open source data storage and processing framework modeled on the approach Google developed to handle its massive data needs, is coming to the enterprise data center. Are you interested?

Behind Hadoop is MapReduce, a programming model and software framework that enables the creation of applications able to rapidly process vast amounts of data in parallel on large clusters of compute nodes. Hadoop is an open source project of the Apache Software Foundation and can be found here.

Specifically, Hadoop offers a framework for running applications on large clusters built from commodity hardware. It uses a style of processing called Map/Reduce, which, as Apache explains it, divides an application into many small fragments of work, each of which may be executed on any node in the cluster. A key part of Hadoop is the Hadoop Distributed File System (HDFS), which reliably stores very large files across nodes in the cluster. Both Map/Reduce and HDFS are designed so that node failures are automatically handled by the framework. Each Hadoop node is a server with its own storage.

Hadoop moves computation to the data itself. Computation consists of a map phase, which produces sorted key/value pairs, and a reduce phase. According to IBM, a distributor of Hadoop, data is initially processed by map functions, which run in parallel across the cluster. The reduce phase aggregates and reduces the map results and completes the job.
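
To make the map and reduce phases concrete, here is a minimal single-machine sketch of the idea in Python, using word counting as the job. It is not Hadoop's API; the function names (map_phase, shuffle, reduce_phase, run_job) are made up for illustration, and everything runs sequentially, whereas a real cluster would run the map calls in parallel on the nodes that hold the data.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(mapped_pairs):
    """Group values by key and sort by key, standing in for the
    framework's work between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce: aggregate all values seen for one key."""
    return key, sum(values)


def run_job(documents):
    """Run map over every document, then reduce each group of values."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped))


if __name__ == "__main__":
    print(run_job(["the cat sat", "the dog sat"]))
```

Each document here stands in for a chunk of data sitting on one node. The map step needs nothing from any other chunk, which is what lets the work spread across a cluster.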

HDFS breaks stored data into large blocks and replicates them across the cluster, providing highly available parallel processing and redundancy for both the data and the jobs. Hadoop distributions provide a set of base class libraries for writing Map/Reduce jobs and interacting with HDFS.
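
A rough sketch of what blocks and replicas mean in practice follows. The 64 MB block size and three copies per block are illustrative values chosen for the example, not figures from this post, and the round-robin placement is a simplifying assumption rather than HDFS's actual placement policy. The function names are hypothetical.

```python
def block_count(file_size_bytes, block_size_bytes):
    """Number of blocks needed to hold a file (the last block may be partial)."""
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin.
    This is an assumed toy policy, not the real HDFS one."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def surviving_replicas(placement, failed_node):
    """Copies of each block that remain after one node fails."""
    return {
        block: [node for node in holders if node != failed_node]
        for block, holders in placement.items()
    }


if __name__ == "__main__":
    MB = 1024 * 1024
    blocks = block_count(200 * MB, 64 * MB)  # illustrative sizes
    layout = place_replicas(blocks, ["node1", "node2", "node3", "node4"], 3)
    print(blocks, surviving_replicas(layout, "node2"))
```

Because every block lives on more than one node, losing a single server still leaves a copy of each block elsewhere, and a job can be rerun against the surviving copies.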

The attraction of Hadoop is its resilience and its ability to find and retrieve data fast from vast unstructured volumes. Hadoop, or some variation of it, is critical for massive websites like Google, Facebook, Yahoo and others. It also is a component of IBM’s Watson. But where would Hadoop play in the enterprise?

Cloudera (www.cloudera.com) has staked out its position as a provider of Apache Hadoop for the enterprise. With Cloudera Enterprise, it primarily targets companies in financial services, Web, telecommunications, and government. Cloudera Enterprise includes the tools, platform, and services necessary to use Hadoop in an enterprise production environment, ideally within what amounts to a private cloud.

But there are other players plying the enterprise Hadoop waters. IBM offers its own Hadoop distribution. So does Yahoo. You also can get it directly from the Apache Hadoop community.

So what are those enterprise Hadoop applications likely to be? A few come immediately to mind:

  • Large scale analytics
  • Processing of massive amounts of sensor or surveillance data
  • Private clouds running social media-like applications
  • Fraud detection applications that must analyze massive amounts of dynamic data fast

Hadoop is like other new technologies that emerge. Did your organization know what it might do with the Web, rich media, solid state disk, or the cloud when they first appeared? Not likely, but it probably knows now. It will be the same with Hadoop.

 


The Internet of Things

The Internet of Things has the potential to change our businesses and our lives as much as or possibly more than today’s Internet. It has been a long time in coming, maybe since the advent of bar codes, but certainly since the development of RFID tags.

Among the recognized thought leaders is McKinsey. You can check out a piece they posted last March here. IBM has a five-minute video that introduces it here.

The Internet of Things is another aspect of the digital transformation of the world. IBM has given it the Smarter Planet label. Others call it the global digital nervous system. It is the collection of devices, phones, computers, sensors, and more that are continuously capturing and communicating digitized information. And once that information is digitized, we can begin to do something with it. What kind of information do you need to advance your business objectives?

When IBM talks about the Smarter Planet it is talking about the Internet of Things. IBM sees it as the intelligence being infused into the systems and processes that make the world work—into things no one would recognize as computers: cars, appliances, roadways, power grids, clothes, even natural systems such as agriculture and waterways.

Would your business like to know how people actually use your products? It might change the way you create, design, build, and market. Of course you could approximate some of this information through focus groups, but they are costly and imperfect. Sensors built into your products and communicating back to you about how they are actually being used would give you the real story.

RFID (Radio Frequency ID) is steadily altering the supply chain. The time when every consumer product has an RFID tag remains some time away, but the technology is being widely adopted in the back room, the back lot, on the shipping dock, and more. That’s the Internet of Things.

Smartphones, WiFi, wireless communications of all types are fueling the Internet of Things. The number of smartphone users soon will be in the hundreds of millions. Each smartphone can be a sensor on the Internet of Things.

Pretty soon, for example, people will use smartphones to purchase a can of soda from a vending machine; that’s the Internet of Things. These phones will be generating presence sensing data, GPS data, motion data, transaction data, and more. Would your marketing department like to know when someone walks into a place selling your product? Better yet, what if they pick up your product and then start to put it down? (Remember, some smartphones sense motion and direction.)

The Internet of Things gets exponentially bigger when digitized surveillance data is added to it. Think of the various CSI and NCIS cop shows where the good guys grab digital video from various surveillance cameras and combine it with blueprints of buildings and schematics. That is Hollywood make-believe today but clearly points to the Internet of Things, a digitally transformed world where vast information is sensed, metered, captured, communicated, and could be available at the click of a button.

Even before then, the demand for IP addresses is straining the capacity of today’s Internet to accommodate new addresses. To meet what is shaping up as insatiable demand for IP addresses, the Internet will shortly be adopting IPv6. That should take care of IP addresses for the rest of any of our lifetimes and beyond.
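
For a sense of scale, the arithmetic behind that claim can be checked with Python's standard ipaddress module. This is just a quick sketch; the address_space helper is a made-up name for illustration.

```python
import ipaddress


def address_space(version):
    """Total number of addresses in the whole IPv4 or IPv6 space."""
    whole_space = "0.0.0.0/0" if version == 4 else "::/0"
    return ipaddress.ip_network(whole_space).num_addresses


if __name__ == "__main__":
    ipv4_total = address_space(4)  # 2**32 addresses
    ipv6_total = address_space(6)  # 2**128 addresses
    print(f"IPv4: {ipv4_total:,}")
    print(f"IPv6: {ipv6_total:,}")
    print(f"IPv6 is 2**96 times larger: {ipv6_total // ipv4_total == 2**96}")
```

IPv4 addresses are 32 bits long, which caps the total at roughly 4.3 billion. IPv6 addresses are 128 bits long, so the supply grows by a factor of 2 to the 96th power.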

What will be needed to succeed in that world is excellent analytics—fast real-time analytics that grab the right information and spew out accurate analyses fast. Are you prepared?

 


Management of Private Clouds

Private clouds are attracting serious attention because they provide many of the benefits of cloud computing, but are deployed behind the corporate firewall. They eliminate security concerns around public clouds and ensure the company retains full control, addressing the two biggest obstacles companies raise about cloud computing. In one Gartner survey, 43% of respondents increased private cloud spending. Private clouds, however, need to be tightly managed.

Cloud computing represents the convergence of several IT trends: virtualization, distributed application design, grid, and enterprise IT management, explains Elber Riberio, Director of Global IT Outsourcing at CPM Braxis Capgemini, a leading remote infrastructure management outsourcing firm. Business executives like the idea of private clouds but often overlook the critical IT management demands.

In the private cloud, IT capabilities, applications and data are delivered over the internal network as services. The IT services and the corresponding virtualized IT resources can be allocated, provisioned, and configured fast: in minutes or hours rather than in days or weeks. This makes private clouds particularly responsive to changes in the business.

The cloud only masks the underlying complexity of IT; it does not eliminate it. Private clouds are more demanding from an IT management perspective as more users turn to them to do more things. For that reason private clouds need more and better management, not less.

If anything, a private cloud will stress the organization’s IT management as never before. Because IT services will be so easy to access and configure, workers will use more and different IT services in more ways and more often. Since it will be easier and cheaper to provision virtual servers, business units will ask for more as they respond to changing opportunities.

The same goes for security. With more users doing more and different things, organizations must be particularly vigilant in monitoring what is happening to ensure users are doing only what they are authorized to do.

These increased management and security challenges alone are enough to lead organizations to opt for some form of expert remote infrastructure management outsourcing (RIMO). Unlike traditional IT outsourcing, RIMO does not entail the transfer of an enterprise’s assets or personnel to the outsourcing provider. Instead, the organization contracts for the provider to continuously monitor its systems, identify potential problems, handle configuration, and resolve problems in real time, all from a remote location.

The offshore India-based companies that handle traditional IT outsourcing also do RIMO. For many organizations RIMO turns out to be a cost-effective way to transition to a private cloud and improve their IT management without having to hire on-site staff.

But competitive RIMO providers now are cropping up much closer. CPM Braxis Capgemini offers North American companies a Brazilian near-shore RIMO alternative to the far-shore options in India. Due to its recent acquisition by Capgemini, the vendor will likely offer an onshore RIMO option too at some point.

It’s easy to get excited by the possibilities of a private cloud; just don’t ignore infrastructure management.

 


Turn IT Rejects into Gold

Every company has, depending on its size, a closet, room, floor, basement, shed, or warehouse where it puts electronic equipment no longer being used or wanted. What seemingly is IT trash (mainly old PCs and servers, but also storage devices, tape libraries, fax machines, copiers, even cell phones) has real cash value.

GlaxoSmithKline gathered up all that equipment and, instead of sending it to a landfill (which is not legal, since much of that stuff contains hazardous material that must be disposed of properly and documented), sent it to PlanITROI, an IT asset disposition company. PlanITROI refurbished and resold the equipment, netting Glaxo $1.8 million from the effort. Check out the case study here.

IT Asset Disposition (ITAD) has turned into enough of an industry that Gartner puts out an ITAD Magic Quadrant report. According to Gartner, ITAD is growing in importance as ever greater volumes of IT equipment reach end of life and companies realize they can turn this disposal headache into real money.

Many companies, including most IT vendors and leasing companies, are forced into the ITAD business by necessity. Gartner’s ITAD report identifies over a dozen companies making a business of it. They include big hardware vendors like IBM, HP, and Dell and firms like Redemtech and PlanITROI (pronounced Planet ROI).

Getting rid of IT hardware is not as simple as it may seem. As noted above, there are hazardous materials involved that must be handled in compliance with various regulations. Gartner points out that the government is getting increasingly strict about the rules, and there are serious penalties for non-compliance.

But there also is cash to be recouped. “Recycle is the last thing we want to do. We really rather refurbish and resell the asset,” explains Andrew Bauer, CFO at PlanITROI. The company will take almost any electronic equipment. Your workers may expect a new laptop every three years, but there are plenty of organizations that will gladly pay less for a perfectly functional older refurbished system. About the only things that it can’t refurbish and resell are CRT monitors; those have no value and must be properly disposed of.

PlanITROI splits with you whatever it makes from reselling your unwanted equipment. “Our clients usually get something north of 50%,” says Bauer. As Glaxo found, that can add up to real money.

Laptop and notebook computers command the best resale prices, as much as $300 each. Similarly, name brand equipment brings better prices.

Before getting rid of any equipment, however, make sure all data has been eliminated. Cleaning data is not as simple as just hitting the delete key. Deleting a file leaves the underlying data on the disk, so the data has to be overwritten. It takes a minimum of three overwrite passes. The Dept. of Defense specifies at least seven overwrite passes. A thorough data cleaning effort requires checking everything from the BIOS and cache memory to any accompanying media. Data also may be lurking in fax machine, printer, and cell phone memories. ITAD vendors usually handle data cleaning as part of the service.
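
To show what a multi-pass cleaning involves, as opposed to hitting the delete key, here is a minimal Python sketch that overwrites a single file in place with random bytes several times. It is an illustration only, not a certified wiping tool. The function name and the default of three passes are assumptions for the example. It also assumes a traditional hard disk, since on solid state drives and some file systems an in-place overwrite may not reach every stored copy of the data.

```python
import os


def overwrite_file(path, passes=3, chunk_size=1024 * 1024):
    """Overwrite a file's contents in place with random bytes, `passes` times.
    Returns the total number of bytes written. Does not delete the file."""
    size = os.path.getsize(path)
    written = 0
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
                written += n
            # Push this pass out of the buffers before starting the next one.
            f.flush()
            os.fsync(f.fileno())
    return written
```

After the final pass the file can be removed with os.remove(path). Whole-drive cleaning, along with the BIOS, cache, and device memories mentioned above, is a bigger job than this sketch covers.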

So, the next time you see old IT equipment junking up your offices, think about converting it to gold.
