Microsoft's codenamed "Cloud Numerics" Lab from SQL Azure Labs has been refreshed.
For those not aware: "The 'Cloud Numerics' lab is a numerical and data analytics library for data scientists, quantitative analysts, and others who write C# applications in Visual Studio. It enables these applications to be scaled out, deployed, and run on Windows Azure. Sign up, use it, and let us know what you think."
The features of the Cloud Numerics Lab now include:
- Improved user experience: through more actionable exception messages, a refactoring of the probability distribution function APIs, and better and more actionable feedback in the deployment utility. In addition, the deployment process time has decreased and the installer supports installation on an on-premises Windows HPC Cluster. All up, this refresh provides for a more efficient way of writing and deploying Cloud Numerics applications to Windows Azure™.
- More scale-out enabled functions: more algorithms are enabled to work on distributed arrays. This significantly increases the breadth and depth of big data algorithms that can be developed using Cloud Numerics. Scale-out functionality was added in the following areas: Fourier Transforms, Linear Algebra, Descriptive Statistics, Pattern Recognition, Random Sampling, Similarity Measures, Set Operations, and Matrix Math.
- Array indexing and manipulation: a large part of any data analytics application concerns handling and preparing data to be in the right shape and have the right content. With this refresh Cloud Numerics adds advanced array indexing enabling users to easily and efficiently set and extract subsets of arrays and to apply boolean filters.
- Sparse data structures and algorithms: many real-world big data sets are sparse, i.e., not every field in a table has a value. With this refresh of the lab we introduce a distributed sparse matrix structure to hold these datasets and introduce core sparse linear algebra functions enabling scenarios such as document classification, collaborative filtering, etc.
- Apply/Sweep framework: in addition to the built-in parallelism of the Cloud Numerics Lab, this refresh now exposes a set of APIs to enable embarrassingly parallel patterns. The Apply framework enables applying arbitrary serializable .NET code to each element of an array or to each row or column of an array. The framework also provides a set of expert-level interfaces to define arbitrary array splits. The Sweep framework performs as its name implies: it enables distributed parameter sweeps across a set of nodes, allowing for better execution times.
- Improved IO functionality: added more parallel readers to enable out-of-the-box data ingress from Windows Azure storage and introduced parallel writers.
- Documentation: introduced detailed mathematical descriptions of more than half of the algorithms using print-quality formulae and best-of-web equation rendering that help clarify algorithm mathematical definition and implementation detail. In addition to the "Getting Started" wiki, we added conceptual documentation for the Cloud Numerics help, including the programming model, the new apply framework, IO, and so on.
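To make the Apply/Sweep item in the list concrete, here is a minimal Python sketch of the two embarrassingly parallel patterns it describes. This is not the Cloud Numerics API, which is a .NET library. The names `apply_rows`, `sweep`, and `squared_error` are hypothetical, and a local thread pool stands in for the cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def apply_rows(func, matrix, max_workers=4):
    """Apply func independently to every row of a list-of-lists matrix."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves input order, so result i belongs to row i.
        return list(pool.map(func, matrix))


def sweep(func, parameters, max_workers=4):
    """Evaluate func once per parameter value and pair each value with its result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(zip(parameters, pool.map(func, parameters)))


def squared_error(guess, target=3.0):
    """Toy objective for the sweep: squared distance from an assumed target."""
    return (guess - target) ** 2


# Apply pattern: one independent task per row.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
row_means = apply_rows(mean, matrix)

# Sweep pattern: one independent task per candidate parameter value.
candidates = [0.0, 1.0, 2.0, 3.0, 4.0]
results = sweep(squared_error, candidates)
best_guess, best_error = min(results, key=lambda pair: pair[1])
```

Because no task depends on another, the work can be split across workers without coordination, which is what makes both patterns suitable for scale-out.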
Microsoft says it will be distributing an F# add-in for Cloud Numerics soon. "The add-in exposes the Cloud Numerics APIs in a more functional manner, introduces operators, such as matrix multiply, and F# style constructors for and indexing on Cloud Numerics arrays," explains Microsoft.
You will also need a Windows Azure subscription to deploy a "Cloud Numerics" application to Azure. If you don't have one, sign up for a free trial here.