In Azure: New HDInsight Capabilities, DocumentDB API for MongoDB Generally Available and More

Find out how to use templates to customize restored VMs from Azure Backup, new series of storage-optimized VMs generally available in 6 regions, SQL grammar expanded in DocumentDB.

This week at Strata + Hadoop World in San Jose, new capabilities coming soon to Azure HDInsight and Azure DocumentDB were announced, as well as Community Technology Preview (CTP) 1.4 of the next version of SQL Server for both Windows and Linux.

Azure HDInsight is a "fully managed open source software (OSS) analytics platform for running all open-source analytics workloads at scale, with enterprise grade security and SLA," and Azure DocumentDB is a "planet-scale fully-managed NoSQL database service," writes the Azure team.

DocumentDB announcements made today include:

The Apache Spark connector for DocumentDB enables real-time data science and exploration over globally distributed data in DocumentDB. The Spark to DocumentDB connector uses the Azure DocumentDB Java SDK.

General availability of high-fidelity, SLA-backed MongoDB APIs for DocumentDB allows customers to move to DocumentDB easily while continuing to use MongoDB APIs, and to get comprehensive enterprise-grade SLAs, turnkey global distribution, security, compliance, and a fully managed service.

HDInsight announcements made today include:

Hortonworks Data Platform 2.6 will be continuously available to HDInsight even before its on-premises release.

HDInsight for Hadoop is becoming more secure, with security capabilities expanded to other workloads including Interactive Hive (powered by LLAP) and Apache Spark. "This allows customers to use Apache Ranger over these popular workloads to provide a central policy and management portal to author and maintain fine-grained access control," the team said.

In addition, customers can now analyze detailed audit records in the familiar Apache Ranger user interface.

Apache Spark for Azure HDInsight provides Spark 2.1 backed by a 99.9% SLA, and now supports real-time streaming solutions through Spark integration with Azure Event Hubs and the structured streaming connector for Kafka on HDInsight.

This will allow customers to use Spark to analyze millions of real-time events ingested into these Azure services, thus enabling IoT and other real-time scenarios.

Real-time data science with Apache Spark and Azure DocumentDB

Interactive Apache Hive 2.1.1 enables data warehouse scenarios in which customers can expect sub-second query performance. Interactive Hive clusters also support popular BI tools, which is useful for business analysts who want to run their favorite tools directly on top of Hadoop.

Finally, a new preview of the next version of SQL Server, Community Technology Preview (CTP) 1.4, is coming soon for both Windows and Linux and offers an enhancement to SQL Server v.Next on Linux.

Another enhancement to SQL Server v.Next on Windows and Linux is support for resumable online index rebuilds of b-tree indexes, which extends flexibility in index maintenance scheduling and recovery.

Our goal is to make big data accessible for everybody. We have designed productivity experiences for different audiences including the data engineer working on ETL jobs with Visual Studio, Eclipse, and IntelliJ support, the data scientists performing experimentation with Microsoft R Server and Jupyter notebook support, and the business analysts creating dashboards with Power BI, Tableau, SAP Lumira, and Qlik support. As part of HDInsight's support for the latest Hortonworks Data Platform 2.6, Zeppelin notebooks, a popular workspace for data scientists, will support both Spark 2.1 and interactive Hive (LLAP). Additionally, we have added popular independent software vendors (ISVs) Dataiku and H2O.ai to our existing set of ISV applications that are available on the HDInsight platform. Through the unique design of HDInsight edge nodes, customers can spin up these data science solutions directly on HDInsight clusters, which are integrated and tuned out-of-the-box making it easier for customers to build intelligent applications.

Microsoft's cloud-first strategy has already shown success with customers and analysts: the company was recently named a Leader in the Forrester Big Data Hadoop Cloud Solutions Wave and in the Gartner Magic Quadrant for Data Management Solutions for Analytics, says the team.

The DocumentDB API for MongoDB, now generally available, allows developers to experience the power of the DocumentDB database engine with the comfort of a managed service and the familiarity of the MongoDB SDKs and tools.

Additionally, a suite of new features was announced to improve the availability, scalability, and usability of the service, including:

  • Sharded Collections
  • Global Databases
  • Read-only Keys
  • Additional portal metrics

The Azure Backup service now lets users customize the virtual machine (VM) created as part of a restore operation, through a customizable template that is deployed along with the restore disks option, says a senior program manager.

The new template will be provided for all non-encrypted standard and premium non-managed disk VMs, and support for encrypted and Managed Disks VMs will be added in a coming release, the team said.

Azure Backup provides three ways to restore from a VM backup: creating a new virtual machine, restoring disks from the VM backup and using them to create a VM, or using instant file recovery from the VM backup.

Templates for Azure Backup Restored VMs

Azure Blob Storage and Managed Disks are now supported in Cloud Foundry, an open source platform that enables developers to extend their workloads from any public or private cloud to Azure.

With Azure Blob Storage, "you can address organizational security and compliance requirements by encrypting your Blob storage," the team said. Azure Managed Disks provide simplified disk management, enhanced scalability and better security.

StorSimple 8000 series Update 4.0 is generally available for customers to apply from the StorSimple Manager Service in Azure. The update can also be applied manually using the hotfix method.

The following new features and enhancements are available today:

  • Heatmap-based restore – No more slowness when accessing data from appliance post device restore (DR). The new feature implemented in Update 4 tracks frequently accessed data to create a heatmap when the device is in use prior to DR. Post DR, it uses the heatmap to automatically restore and rehydrate the data from the cloud.
  • Performance enhancements for locally pinned volumes – This update has improved the performance of locally pinned volumes in scenarios that have high data ingestion.
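
The heatmap idea described above can be illustrated with a minimal sketch. It is not StorSimple's implementation: the `HeatmapTracker` class, the chunk identifiers, and the "most accessed first" ordering rule are all assumptions chosen to show how access counts gathered before a device restore could drive the rehydration order afterwards.

```python
from collections import Counter


class HeatmapTracker:
    """Hypothetical tracker that counts how often each data chunk is read."""

    def __init__(self):
        self.access_counts = Counter()

    def record_access(self, chunk_id):
        # Called on every read while the device is in normal use.
        self.access_counts[chunk_id] += 1

    def restore_order(self, all_chunks):
        # Hottest chunks first; sorted() is stable, so chunks with equal
        # counts (including never-accessed ones) keep their original order.
        return sorted(all_chunks, key=lambda chunk: -self.access_counts[chunk])


tracker = HeatmapTracker()
for chunk in ["b", "a", "b", "c", "b", "a"]:
    tracker.record_access(chunk)

order = tracker.restore_order(["a", "b", "c", "d"])
print(order)  # ['b', 'a', 'c', 'd']
```

Chunk "d" was never read, so it is rehydrated last, while the frequently read "b" comes back first.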

Microsoft introduced a new series of virtual machine sizes Wednesday – the L Series for Storage, which is optimized for workloads that require low latency and offers up to 32 central processing unit (CPU) cores, using the Intel Xeon processor E5 v3 family, writes Jon Beck.

L Series virtual machines are generally available in these regions: East US 2, West US, Southeast Asia, Canada Central, Canada East and Australia East.

L Series for Azure Storage VMs

Azure Site Recovery combines a unique, cloud-first design with a simple user experience for a powerful solution that recovers not just virtual machines, but entire applications in case of a disaster.

"With support for single and multi-tier application consistency and near continuous replication, Azure Site Recovery ensures that no matter what application you are running, shrink-wrapped or homegrown, you are assured of a working application when a failover is issued," writes Microsoft principal lead program manager.

Additional product information is available for those who want to start replicating workloads to Microsoft Azure using Azure Site Recovery.

Azure Site Recovery

Azure DocumentDB supports querying documents using a familiar SQL-like grammar, which has now been expanded to support aggregate functions. "Support for aggregates is the most requested feature on the user voice site, so we are thrilled to roll this out," says a program manager.

DocumentDB is a fully managed NoSQL database service built for fast and predictable performance, high availability, elastic scaling, global distribution and ease of development.

The following aggregate capabilities are added to the SQL grammar today:

  • Aggregates for planet-scale applications: perform aggregate queries against data of any scale with low latency and predictable performance.
  • Aggregate support has been rolled out to all DocumentDB production datacenters; use it with existing accounts or provision new DocumentDB accounts via the SDKs, REST API, or the Azure Portal.
  • DocumentDB SQL now supports the aggregate functions COUNT, MIN, MAX, SUM, and AVG.
  • Query for aggregates with LINQ in addition to SQL with .NET SDK 1.13.0.
  • Aggregates using the Azure Portal

Aggregate Query in Azure DocumentDB
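
To make the five aggregate functions concrete, the sketch below pairs the kind of query text one might send to DocumentDB with a purely local Python equivalent. No service call is made. The sample documents, the `temperature` property, and the collection alias `c` are assumptions for illustration; the `SELECT VALUE COUNT(1) FROM c` form follows the DocumentDB SQL aggregate syntax as commonly documented, so check the query reference before relying on it.

```python
# Sample documents standing in for a DocumentDB collection (hypothetical data).
documents = [
    {"id": "1", "city": "Seattle", "temperature": 12},
    {"id": "2", "city": "San Jose", "temperature": 18},
    {"id": "3", "city": "Phoenix", "temperature": 30},
]

# Query text for each aggregate; `c` and `temperature` are assumed names.
AGGREGATE_QUERIES = {
    name: f"SELECT VALUE {name}(c.temperature) FROM c"
    for name in ("MIN", "MAX", "SUM", "AVG")
}
AGGREGATE_QUERIES["COUNT"] = "SELECT VALUE COUNT(1) FROM c"


def local_aggregates(docs, field):
    """Compute locally what the five aggregate queries would return."""
    values = [doc[field] for doc in docs]
    return {
        "COUNT": len(values),
        "MIN": min(values),
        "MAX": max(values),
        "SUM": sum(values),
        "AVG": sum(values) / len(values),
    }


results = local_aggregates(documents, "temperature")
for name, value in results.items():
    print(f"{AGGREGATE_QUERIES[name]:45} -> {value}")
```

Each query returns a single scalar, which is why the sketch uses the `SELECT VALUE` form rather than returning a wrapped document.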

Power BI solution templates now support Azure Analysis Services for actionable insights

Power BI solution templates now support Azure Analysis Services, which can transform complex data into actionable insights.

With Power BI solution templates, users can create compelling analytics and visualizations on a scalable and secure architecture.

"This means that instead of spending weeks or months getting going, you can get started immediately and spend your time on extending and customizing the result to meet your organization's needs," Tkachuk writes.

Microsoft also announced on Wednesday that Azure Government is now accessible through the Azure Storage Explorer as of the 0.8.9 release.

With this, Government customers can now take advantage of all the latest features of the Azure Storage Explorer, such as creating and managing blobs, queues, tables, and file shares. Now, when the Azure Storage Explorer desktop application is opened, it asks how you want to connect to your storage account.

The newest version now provides an option to add an "Azure US Government" account:

Access Azure Government through Storage Explorer

The Department of Veterans Affairs (VA) has issued a FedRAMP High Authority to Operate (ATO) for Microsoft Azure Government. In addition to an Agency FedRAMP High ATO from VA, Azure Government has also received a FedRAMP High P-ATO from the Joint Authorization Board (JAB), writes the Azure team.

Microsoft Azure Government leads the industry with 32 FedRAMP-approved services spanning both infrastructure-as-a-service and platform-as-a-service offerings.

A complete list of Azure services covered under the Azure Government FedRAMP High ATO can be found by visiting the Microsoft Trust Center.

The Azure Government High ATO will also allow VA to build solutions using true platform-as-a-service and advanced service components including:

  • Azure Web Apps
  • Application Gateway
  • Azure Active Directory
  • Media Services
  • Power BI
  • Azure SQL Database
  • Redis Cache

System Center Management Pack for SQL Server and Dashboards (6.7.20.0) was released today for download.

A bug bounty program for Office Insiders, offering up to $15,000 in payouts, was announced on Wednesday.

"Payouts can be up to $15,000, but it varies and can be as little as $500. Elevation of privilege in Protected Mode can be between $9,000 and $15,000, depending on the report quality. Macro execution pays out the same, but bypassing the security features in Outlook pays between $6,000 and $9,000."

Here is how the program works:

  • Types of vulnerabilities awarded and their details are listed in the Microsoft Office Insider Builds on Windows Bounty Program Terms, including:
    • Elevation of privilege via Office Protected View
    • Macro execution by bypassing security policies to block macros
    • Code execution by bypassing Outlook automatic attachment block policies
  • The program runs for three months, from March 15 to June 15, 2017
  • Bounty payouts during this period will range from $6,000 to $15,000 USD
  • Call to action: send your vulnerabilities to secure@microsoft.com
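
As a rough illustration of how the program window and the quoted payout ranges fit together, here is a small lookup sketch. The category labels are hypothetical names, the ranges are the ones quoted in this article rather than the official program terms, and treating both the start and end dates as inclusive is an assumption.

```python
from datetime import date

# Program window as stated in the article; inclusive bounds are an assumption.
PROGRAM_START = date(2017, 3, 15)
PROGRAM_END = date(2017, 6, 15)

# Ranges in USD as quoted in the article; the keys are hypothetical labels.
PAYOUT_RANGES = {
    "protected_view_elevation_of_privilege": (9000, 15000),
    "macro_execution": (9000, 15000),
    "outlook_attachment_block_bypass": (6000, 9000),
}


def payout_range(category, submitted_on):
    """Return (low, high) in USD, or None if out of window or unknown category."""
    if not (PROGRAM_START <= submitted_on <= PROGRAM_END):
        return None
    return PAYOUT_RANGES.get(category)


in_window = payout_range("outlook_attachment_block_bypass", date(2017, 4, 1))
too_late = payout_range("macro_execution", date(2017, 7, 1))
print(in_window)  # (6000, 9000)
print(too_late)   # None
```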

The following may disqualify a submission:

  • Vulnerabilities in anything earlier than the current Office Insider slow build on Windows Desktop
  • Vulnerabilities in user-generated content
  • Vulnerabilities requiring extensive or unlikely user actions
  • Vulnerabilities found by disabling existing security features
  • Vulnerabilities in components not installed by Office
  • Vulnerabilities in third party components that might be installed on the system that enable the vulnerability
  • Vulnerabilities about escaping Protected View where Protected View is explicitly not activated in Office code or enabled by default for the reported scenario
  • Vulnerabilities in the Application container
  • Any other category of vulnerability that Microsoft determines to be ineligible, in its sole discretion

Team Foundation Server 2015 Update 4 RC, announced on Wednesday, contains only bug fixes, almost 25 in total, including:

  • The @Today and @Me macros do not work correctly in non-English in the Kanban board card style rules.
  • The inline add card experience on the Kanban board does not work correctly. For example, the Title field cannot be edited.
  • If a user switches between two work items of the same type on the queries page before the HTML fields finish loading, the HTML field may become empty and the work item will become dirty.
  • The Batch API, such as WorkItemStore.GetWorkItemIdsForArtifactUris(), may return incorrect results when called with many strings.
  • When a customer has rules in the global workflow and tries to move them to a work item type definition, there will be an error, "TF237090: Does not exist or access is denied".
  • If a TFS instance has a collection with a space in the name and has a public URL that is different from the internal URL, inline images may be missing in work items when opened by another user.
  • Work item tracking warehouse sync fails with a name conflict when field names only differ by a space replacing a "." or "_" (e.g. "My Field" and "My_Field").
  • Work item tracking warehouse sync fails when a work item has a link comment that contains special characters, such as 0x0B.
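
The field-name conflict in the warehouse sync fix can be pictured with a short sketch. It does not reproduce how TFS normalizes names internally; it assumes, for illustration only, that "." and "_" are each treated like a space, and the helper `find_conflicting_field_names` is a hypothetical function.

```python
from collections import defaultdict


def find_conflicting_field_names(names):
    """Group field names that differ only by a space versus '.' or '_'."""
    groups = defaultdict(list)
    for name in names:
        # Assumed normalization: treat "." and "_" the same as a space.
        key = name.replace(".", " ").replace("_", " ")
        groups[key].append(name)
    # Only groups with more than one member represent a conflict.
    return [group for group in groups.values() if len(group) > 1]


conflicts = find_conflicting_field_names(
    ["My Field", "My_Field", "Priority", "My.Field", "Story Points"]
)
print(conflicts)  # [['My Field', 'My_Field', 'My.Field']]
```

A check like this could be run over custom field names before upgrading to see whether a project was exposed to the issue.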