Anthony Lopez

“I’d rather see a sermon than hear one any day;”

Velocity 2008 – some informal notes

Posted by lopeza on September 3, 2008

This post is a little late, but better late than never. It’s a few notes from the Velocity conference. They are in no way organized or complete.

Green Data centers
Bill Coleman (built Solaris)
Build infrastructure by policy and use cloud computing.
Recommends taking the step into cloud computing to make use of less critical systems at different times.
Don’t be afraid to run mission critical systems, dev systems, QA systems, and test systems on the same infrastructure managed by policy.

Keynote Systems Launches KITE
Vik Chaudhary (Keynote Systems, Inc.), Abelardo Gonzalez (Keynote Systems)
-Online testing of a website from 5 different locations.
-You can define where an action starts and stops, so you can graph at finer granularity.
-Product is free
-Offers other benchmark tools for free

Jiffy: Open Source Performance Measurement and Instrumentation Performance
Scott Ruthfield (WhitePages.com)
– Real world performance metrics
– Performance lesson
– slow is bad
– ad networks give you ads from sub ad networks…sub ad networks…etc.
– Jiffy: a small unit of time; the tick between system clock interrupts
– ability to measure anything
– measure the time it takes before the first thing is written to the user’s screen.
– real-time or near-real-time reporting.
– no impact on page performance
– Jiffy has many reporting tools
– put code on your page that will measure the start and end of what you want to measure.
– submit measurements immediately or batch them and submit later.
– firebug plug-in
– code.whitepages.com – project link + slides
– What is Gomez data?
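
The mark/measure/batch idea from the notes above can be sketched in a few lines. Jiffy itself runs as JavaScript in the browser; this is only a language-neutral illustration written in Python, and the `Timer` class, its method names, and the `submit` callback are all made up for the example. None of it is Jiffy's actual API.

```python
import time


class Timer:
    """Hypothetical mark/measure recorder (not Jiffy's real API)."""

    def __init__(self, submit, batch_size=1, clock=time.perf_counter):
        self._submit = submit          # callback that receives a list of measurements
        self._batch_size = batch_size  # 1 means "submit immediately"
        self._clock = clock            # injectable so the timer can be tested
        self._marks = {}
        self._pending = []

    def mark(self, name):
        """Record the start of whatever you want to measure."""
        self._marks[name] = self._clock()

    def measure(self, name):
        """Record the end; raises KeyError if mark(name) was never called."""
        elapsed_ms = (self._clock() - self._marks.pop(name)) * 1000.0
        self._pending.append((name, elapsed_ms))
        if len(self._pending) >= self._batch_size:
            self.flush()
        return elapsed_ms

    def flush(self):
        """Send everything collected so far in one submission."""
        if self._pending:
            self._submit(list(self._pending))
            self._pending.clear()


# Batch two measurements into a single submission.
received = []
timer = Timer(submit=received.append, batch_size=2)
timer.mark("render_header")
timer.measure("render_header")
timer.mark("load_ads")
timer.measure("load_ads")
print(received)
```

With `batch_size=1` every measurement is submitted as soon as it ends; a larger value trades freshness for fewer submissions.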

Artur Bergman
Wikia – Keynote
Operations:
– execute a repeatable process (bad operations waste money)
– efficient use of resources
Business
– cost per page view
– cost per page
Wikia problems
– 20% of wikia pages
– 200ms -> 14s to load
– 35% reduction of page views because of slow pages
– use ganglia for apache performance monitoring
Perception
– Ads are slow
– load ads after the content

Sun Speaker
Discussing hardware
Map of web 2.0
– internet and clients
– reverse proxies
– web app servers and cache servers
– database and storage nodes
Web 2- kit
– tools to test your app for horizontally scalable architecture.
Compute
-more threads, greater efficiency, lower energy consumption, less heat
– cores and threads are on the move: AMD and Intel are going to 8 cores and UltraSPARC is going to 16 cores
– App perf and memory capacity
– Sun’s strategy is to build infrastructure that makes open source better and to engage developers to start thinking about how to use that infrastructure.
– solid state storage coming soon
– 15k rpm, 146 GB disk: 180 write IOPS, 320 read IOPS, $2.4/IOP
– SSD: $0.08/IOP
– zfs file system
**** Five 15k disks run 3 times slower than two 32 GB SSDs plus five 4.2k disks.
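
To put the per-IOP prices in perspective, here is a quick back-of-the-envelope calculation. The two prices are the ones from the notes (I am assuming both are US dollars per IOP); the 10,000 IOPS workload is a made-up number purely for illustration.

```python
# Prices as written in the notes, assumed to be US dollars per IOP.
DISK_COST_PER_IOP = 2.40  # 15k rpm, 146 GB disk
SSD_COST_PER_IOP = 0.08


def cost_ratio(disk_cost_per_iop, ssd_cost_per_iop):
    """How many times more a disk IOP costs than an SSD IOP."""
    return disk_cost_per_iop / ssd_cost_per_iop


def cost_for_workload(target_iops, cost_per_iop):
    """Rough hardware cost to reach a target IOPS at a given price per IOP."""
    return target_iops * cost_per_iop


TARGET_IOPS = 10_000  # hypothetical workload, not from the talk

ratio = cost_ratio(DISK_COST_PER_IOP, SSD_COST_PER_IOP)
print(f"disk is about {ratio:.0f}x the cost per IOP")
print(f"disk: ${cost_for_workload(TARGET_IOPS, DISK_COST_PER_IOP):,.0f}")
print(f"ssd:  ${cost_for_workload(TARGET_IOPS, SSD_COST_PER_IOP):,.0f}")
```

This only compares price per IOP. It says nothing about capacity, where the spinning disks still win.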

Energy Efficient Operations: Some Challenges and Opportunities
Luiz Barroso (Google)
Measuring computing energy efficiency
– harder for computers than for refrigerators
-efficiency = work done/energy used = computing speed/power
Datacenter energy efficiency
Underutilized datacenters
– wasted power
– makes cooling and power distribution less efficient
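
A small sketch of why underutilized machines waste power, using the efficiency definition above (computing speed divided by power). The linear power model, the 300 W peak, the 50% idle draw, and the 1000 ops/sec peak are all my own assumptions for illustration; none of these numbers come from the talk.

```python
def power_watts(utilization, peak_watts=300.0, idle_fraction=0.5):
    """Assumed linear model: a fixed idle draw plus a part that grows with load."""
    idle = peak_watts * idle_fraction
    return idle + (peak_watts - idle) * utilization


def efficiency(utilization, peak_ops_per_sec=1000.0):
    """Operations per joule = computing speed (ops/sec) / power (watts)."""
    speed = utilization * peak_ops_per_sec
    return speed / power_watts(utilization)


for u in (0.1, 0.5, 1.0):
    print(f"utilization {u:.0%}: {power_watts(u):.0f} W, "
          f"{efficiency(u):.2f} ops per joule")
```

Under these assumptions a server at 10% load still draws more than half of its peak power, so it does far less work per joule than a busy one. That is the argument for using most of your capacity most of the time.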
Server energy efficiency
Plan for today
Datacenter provisioning efficiency
– energy costs of the facility are important
– use most of your capacity most of the time
– Measure your actual power usage; use PDUs with amp displays
Conclusion
– write good/fast code (software engineer’s biggest contribution to energy efficiency)
– consider reduction of all energy-related costs (electricity and datacenter provisioning)
– Demand energy-efficient hardware
– Google is investing in renewable energy
– rechargeit.org

Cloudstatus.com
Clouds Are No Substitute for Competence
Javier Soltero (Hyperic, Inc.)
Clouds are no substitute for competence.
Apps are selectively using cloud services
Hard to determine the problem in your app when consuming 3rd party services.
What happens when there is a problem?
– cloudstatus.com is trying to provide more visibility into cloud problems and stats.
– service health and performance
Demo
– starting and stopping images and graphing the measurements over time
– putting and getting S3 data and graphing the measurements over time
– SQS
– SimpleDB

Everything You Ever Wanted to Know About CDNs (But Were Afraid to Ask)
Jacob Rosenberg (AOL), Michael Gordon (Limelight Networks), Keith Oslakovic (Akamai Technologies), Laird Popkin (Pando Networks), Patrick Harr (Nirvanix)
– cdns can be considered the first kind of computing cloud.
– abstraction of technology is what the panel believes the challenge is for all companies.
– clouds are seen as great tools for startups, but the panel agrees that a business decision will occur as to when to bring services in house.
– This talk provided a fairly solid system of dealing with incidents. There are certainly parts that can be implemented on all scales of incident management in IT.
– Large-scale organizations have tried to build their own CDNs but quickly realize that a cost-benefit analysis favors outsourcing to a dedicated CDN. Scaling and deliverability are the main reasons.
– how do cdns compete given the compression of cdn pricing and delivery costs?
– big cdns will try to continue to provide value for the service and product they provide.
– Personally I don’t think they answered the real question.

Actionable Logging for Smoother Operation and Faster Recovery
Mandi Walls (AOL)
– Logging goals
– diagnosis and recovery
– statistics and monitoring
– provide insight into the behavior of the app
– indicate potential issues, and areas for improvement
– not the same goals as dev and qa environments
Types of Logs
– access log
– server log, i.e. catalina.out
– app logs
– special use logs for recording specific groups of activities
Log file management
– everyone has their own method
– roll logs into files with timestamps
Log quality info
– logs should be expressive but not overly verbose
– keys to making logs more actionable
– timestamps that mean something and give context for linking to external events like network outages or traffic anomalies
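
One way to get both points (rolled log files with timestamps, and timestamps you can line up with external events) is Python's standard `logging` module. This is a minimal sketch, not something shown in the talk; the logger name, the log message, and the temp-directory path are placeholders I picked for the example.

```python
import logging
import logging.handlers
import os
import tempfile
import time


def build_logger(path, name="app"):
    """Logger that rolls its file at midnight UTC and stamps lines in UTC."""
    handler = logging.handlers.TimedRotatingFileHandler(
        path, when="midnight", backupCount=7, utc=True
    )
    formatter = logging.Formatter(
        fmt="%(asctime)s %(levelname)s %(name)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%SZ",
    )
    # UTC timestamps are easier to correlate with network or traffic events
    # reported by other systems than server-local time.
    formatter.converter = time.gmtime
    handler.setFormatter(formatter)

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


# Placeholder location so the example does not write into the current directory.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")
logger = build_logger(log_path)
logger.info("request served path=/index.html status=200 duration_ms=42")

with open(log_path) as f:
    print(f.read(), end="")
```

Rolled files get a date suffix appended to the file name, and `backupCount=7` keeps about a week of them. Key=value pairs in the message keep lines expressive without being verbose.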

Stress, Load, and Performance Testing in Quality Assurance
Goranka Bjedov (Google)
– Using a small test environment will help you fix 80% of the problems in your application. You don’t always need the exact same infrastructure to load test and benchmark.
– use open source
– create reports
– never guarantee that performance will match the tests at all times. (too many variables)

Incident Command for IT: What We Can Learn from the Fire Department
Brent Chapman (Great Circle Associates, Inc.)
– who manages emergencies on a daily basis, and what can we learn from them?
Command section
– overall management of the incident
Operations section
– develop and execute plans to achieve objectives set by command
Planning section
– collects and evaluates info needed to prepare action plans
– forecasts probable course of incident
– plans for next day, etc
– keeps track of status
Logistics section
– responsible for obtaining all resources, services, and support to deal with incident.
Admin/finance section
– track incident related costs

Getting into the Cloud(s)
Jesse Robbins (O’Reilly Radar), Ezra Zygmuntowicz (EngineYard), Jeff Barr (Amazon Web Services), Jason Hoffman (Joyent, Inc.), Peter Nickolov (3TERA), Jonathan Bryce (Mosso, a division of Rackspace), Paul Colton (Aptana, Inc.)

– don’t get caught up on the cloud’s physical hardware. You have to be abstract and ignore the details…basically trust that the cloud provides the procs/mem/disk as advertised.
– use a provider using the cloud to provide technology ops and hardware to build their infrastructure for them.
– there are many different ways to use or build your cloud. Understand what will work for your specific app.
Do the cloud providers influence a developer’s applications?
– the providers admit that they offer many suggestions on scaling and architecting for scale.
– the entire panel was open to answering questions on how they operate. It was nice to see their transparency.

LinkedIn Communication Architecture
Sean Dawson (LinkedIn), Ruslan Belkin (LinkedIn)

Performance Metrics panel
Eric at AOL, NetForecast, Eric Schurman, Vic

Day 2
EUCALYPTUS – Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems
Rich Wolski (University of California, Santa Barbara (UCSB))
– elastic utility computing architecture linking your programs to useful systems
– goals: foster research in elastic cloud computing; serve as an experimentation vehicle prior to buying commercial services; provide a debugging and development platform for EC2; provide a basic software development platform for the open source community. It is not designed as a replacement technology for EC2 or any other cloud service.
Challenges
– simple architecture and open APIs
– client side interface
– networking
– security
– packaging installation and maintenance
Architecture diagram, VDE
– used to build virtual Ethernets
– lose about 30% of your network bandwidth.

Harold from Akamai
Caching dynamic content that can’t be cached
– they have their own protocol to eliminate the round trip of packets.
– think of things that don’t have to go to the origin but can be done on the edge. like colors on a car site.

HttpWatch
– free version
Fiddler
– proxy to debug your website
– enables https interceptions via a self-signed certificate
– runs as a local proxy
Fiddler for performance
– measure request size, page weight
-analyze caching, compression, page composition
-simulate low speed/ high-latency connections
Performance statistics include summaries.
Includes session time line
Filters
-allows you to pare down your traffic.
-break point debugging
Custom rules
Traffic Modification
-redirect requests to a particular datacenter
-simulate a downed server
Traffic archives
Fiddlercap.com
– used to send the web data to techs for debugging.
Eric Goldsmith from AOL demoing Pagetest
– plug-in for IE to give you page load times
– free
– download from SourceForge.
– There is an online hosted version
-simulates different connection speeds.
webpagetest.org

Profiling Dynamic Web Applications with Firebug
John J. Barton (IBM)
Profiling software installed on Firefox
– it’s really easy
– free
– makes JavaScript profiling easy

Storage at Scale
Sean Quinlan (Google, Inc.)
storage systems at Google
– GFS – Google file system
– currently scales to 1000s of servers and petabytes of data
– gfs architecture (look further into this)

Performance Plumbing
Adam Bechtel (Yahoo!)
– your company grows and you build more datacenters
– then you figure out how to send data like email via direct pipes to save on expenses
– soon enough you have a backbone

Building an Automated Infrastructure
Adam Jacob (HJK Solutions)
-infrastructure is generally a set of interconnected structural elements…
-get slides
-automation makes things repeatable and less prone to missed steps
Automation
– use Kickstart, JumpStart, SystemImager
– set up a PXE boot server
– use dns server or use puppet for /etc/hosts
– server inventory…use a wiki or any other central place
– use ldap or AD for user management
– use version control
– config management tool
– cfengine
– puppet
– bcfg2
– vertebra
– system inventory tools
– iclassify
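
To make the "central inventory drives config" idea concrete, here is a tiny sketch that renders /etc/hosts entries from a server inventory. In practice a tool like puppet would do this from templates; the inventory format, host names, addresses, and the `example.internal` domain below are all invented for illustration and are not from the talk.

```python
import ipaddress

# Hypothetical inventory; in real life this would come from the inventory tool.
INVENTORY = [
    {"hostname": "web01", "ip": "10.0.0.11", "roles": ["web"]},
    {"hostname": "web02", "ip": "10.0.0.12", "roles": ["web"]},
    {"hostname": "db01", "ip": "10.0.0.5", "roles": ["database"]},
]


def render_hosts(inventory, domain="example.internal"):
    """Return /etc/hosts content with one line per server, sorted by address."""
    lines = ["127.0.0.1 localhost"]
    # Sort numerically so the output is stable and diffs cleanly in version control.
    for server in sorted(inventory, key=lambda s: ipaddress.ip_address(s["ip"])):
        short = server["hostname"]
        lines.append(f"{server['ip']} {short}.{domain} {short}")
    return "\n".join(lines) + "\n"


print(render_hosts(INVENTORY), end="")
```

Because the output is deterministic, the generated file can be checked into version control and every host gets the same copy. Adding a server becomes a one-line inventory change.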
Monitoring your systems
– nagios
Application deployment
– capistrano (written in Ruby, popular for Rails deployments)
– controltier
– traffic grows, launch more ec2 images, system inventory recognizes new servers, talk to system inventory and configs get updated, deploy apps with capistrano, add user to ldap, add to monitoring system.
http://is.gd/EML – list of tools mentioned
“HJK does this for a living but you can ask me [him] how to do it for free.”
trending software for automation, munin and ganglia.
PUPPET
git is better than svn
