What advice would you give Benjamin Braddock in 2012? The attached infographic from EMC (Click HERE) makes a compelling case for choosing a career as a Data Scientist. The next generation of business leaders will use data very differently. Data is no longer used just to generate reports and guide decision making in financial management and supply chain management. The emphasis now is on driving predictive strategies and guiding activities in selling, marketing and operations. Ad-hoc spreadsheets are giving way to sophisticated solutions using machine learning and Big Data. Reliance on analytics is fundamental to the 21st century organization. Data is the new Plastics!
Recently I have been busy creating a series of dashboard reports for one of my clients. As the reports are to be integrated into a custom, in-house developed web portal, the report server needed to be installed in SharePoint Integrated Mode (SharePoint 2010). As the application backend is based on SQL Server 2008 R2 and the core functionality has been developed in .NET, SSRS was easy to integrate using the ReportViewer web part. The majority of the reports use stored procedures to fetch data from tables which already contain pre-aggregated data (between 300 and 1200 rows and approximately 10 columns).
Most of the reports perform very well, with execution times ranging from 900 milliseconds to 2 seconds. The exception is a batch of reports which contain visually rich elements, e.g. charts and gauges embedded within a matrix. A quick, back-of-the-napkin calculation of the number of images to display (8 x 3 x 7, multiplied by two matrix objects) gives us 336 images to render each time the report is run.
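The napkin calculation above can be written out as a short Python sketch. The factors 8, 3 and 7 and the two matrix objects come straight from the figures quoted; what each factor stands for is not spelled out, so the names below are purely illustrative.

```python
from math import prod

# Figures quoted in the post; the meaning of each individual factor
# is not stated there, so they are kept as an anonymous tuple.
MATRIX_DIMENSIONS = (8, 3, 7)
MATRIX_COUNT = 2


def images_per_run(dimensions, matrix_count):
    """Number of dynamically generated images rendered on one report run."""
    return prod(dimensions) * matrix_count


total_images = images_per_run(MATRIX_DIMENSIONS, MATRIX_COUNT)
print(total_images)  # 336
```

Each matrix contributes 168 images, so every extra matrix object of the same shape adds another 168 image requests to a single report run.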
Also, since the images are created dynamically by SSRS, each one is given a unique identifier and is not cached, so the same image may be requested again even though an identical copy is already displayed multiple times. Given that all of these images are relatively small (under 1KB each) and that rendering takes advantage of pretty gutsy backend hardware, you can imagine how unimpressed I was when the report took 48 seconds to fully display. Also, the majority of those reports will be run by the client’s customers over a WAN, spread out all over the country, so there is an extra cost associated with traversing the Internet, not just our LAN. Below is a short screen capture showing the report’s unacceptable execution speed.
According to Microsoft, SharePoint Integrated Mode involves a lot more WFE (web front end) API calls, as well as WFE-to-SSRS API calls. These contribute to the overall rendering time, which we later confirmed by running a trace using Wireshark as well as the in-browser Google Developer Tools (GDT). Below are the screen captures from GDT run in Chrome, depicting the size of the images loaded as well as the final call for an image after nearly 44 seconds of blocking due to the synchronous nature of the transfer.
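To put those numbers in perspective, here is a rough Python sketch of the cost per image. It assumes, purely for illustration, that the 336 images are fetched strictly one after another and that the whole 44 seconds of blocking is spent on those requests. Neither assumption was measured, so treat the result as an order-of-magnitude estimate only.

```python
# Figures taken from the post.
IMAGE_COUNT = 336          # images rendered per report run
BLOCKING_SECONDS = 44      # blocking time before the final image call
MAX_IMAGE_SIZE_KB = 1      # every image is reported as under 1KB


def implied_latency_ms(total_seconds, image_count):
    """Average time per image if all requests ran strictly in sequence."""
    return total_seconds * 1000 / image_count


per_image_ms = implied_latency_ms(BLOCKING_SECONDS, IMAGE_COUNT)
max_payload_kb = IMAGE_COUNT * MAX_IMAGE_SIZE_KB

print(f"{per_image_ms:.0f} ms per image")          # 131 ms per image
print(f"at most {max_payload_kb} KB transferred")  # at most 336 KB transferred
```

With an upper bound of roughly 336KB for the whole payload, bandwidth is unlikely to be the bottleneck. Under these assumptions the time goes into per-request overhead, which is consistent with the extra API calls described above.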
Running the same reports in Native Mode, the performance was respectable and easily aligned with the client's KPIs and SLAs. As far as I’m aware, Microsoft has not released any fixes or improvements in this area, even though they are aware of the issue. SQL Server 2012 (codename Denali) is supposed to bring major improvements here, but if you installed or recently upgraded to version 2008 (R2), you’re out of luck. I’m even aware of some BI departments that held off upgrading from version 2005 because of the impact this issue would have on their reporting standards.
My name is Martin and this site is a random collection of recipes and reflections about various topics covering information management, data engineering, machine learning, business intelligence and visualisation, plus everything else that I fancy categorising under the 'analytics' umbrella. I'm a native of Poland but since my university days I have lived in Melbourne, Australia and worked as a DBA, developer, data architect, technical lead and team manager. My main interests lie both in helping clients with technical aspects of information management, e.g. data modelling, systems architecture and cloud deployments, and in business-oriented strategies, e.g. enterprise data solutions project management, data governance and stewardship, data security and privacy, or data monetisation. On the whole, I am very fond of anything closely or remotely related to data, and as long as it can be represented as a string of ones and zeros and then analysed and visualised, you've got my attention!
Outside of sporadic updates to this site, I typically find myself fiddling with data, spending time with my kids or a good book, going to the gym, or watching a good movie while eating Polish sausage with Zubrowka (best served on the rocks with apple juice and a lime twist). Please read on and if you find these posts of any interest, don't hesitate to leave me a comment!