Everything Penguin

Focusing on Linux-based Operating Systems

    Linux IO Schedulers
    Brett Lee
    ====================================
    
    No doubt you've heard of CPU schedulers.  But have you heard of I/O
    schedulers?  Yep, plural.
    
    There are a few:
    
     - Elevator - 2.4
     - No-op - 2.6
     - Deadline - 2.6
     - Anticipatory - default from 2.6.0 until 2.6.18
     - Completely Fair Queuing - default from 2.6.18
    
    
    Here are some details:
    
      * The Linus Elevator (elevator)
        - Default scheduler for the 2.4 kernel
        - Attempts to minimize seek time
        - Rather than spatter about, the head reads steadily in one direction
          and then the other
        - All read/write requests are immediately sorted/prioritized by location
          - This means if requests come in for the current area, it will delay
            the read or write of a distant area
          - Read requests typically block, whereas write requests typically stream
            to a fixed area
            - Result is writes-starving-reads
    
    
      * The NOOP Scheduler (noop)
        - Optional scheduler in the 2.6 kernel
        - The term "scheduler" is used rather loosely.  It is just a FIFO.
    
    
      * The Deadline I/O Scheduler (deadline)
        - Optional scheduler in the 2.6 kernel
        - Maintains the elevator as much as possible, but separates read and write
          requests into two different FIFO queues
          - Each FIFO has a timeout, and requests are entered into the queue with
            a timestamp
          - Timeouts:  Read FIFO = 500 milliseconds :: Write FIFO = 5 seconds
        - If a read sits in the queue for > 500 milliseconds, this I/O scheduler
          stops the current write and handles the first (and a few more) read
          requests
          - In this way, the reads are starved for much less time
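
        The two timeouts above can be sketched as a small shell function.
        This is only a toy model of the expiry rule as described in these
        notes, not how the kernel implements it.  The function name
        request_expired and its arguments are made up for this example.

```shell
#!/bin/sh
# Toy model of the deadline rule: a request has expired once it has
# waited longer than the timeout of its FIFO (values taken from the notes).
READ_EXPIRE_MS=500     # read FIFO timeout
WRITE_EXPIRE_MS=5000   # write FIFO timeout

# request_expired TYPE AGE_MS  (hypothetical helper, not a kernel interface)
# Prints "expired" or "pending".
request_expired() {
    case "$1" in
        read)  limit=$READ_EXPIRE_MS ;;
        write) limit=$WRITE_EXPIRE_MS ;;
        *)     echo "unknown request type: $1" >&2; return 1 ;;
    esac
    if [ "$2" -gt "$limit" ]; then
        echo expired
    else
        echo pending
    fi
}

request_expired read 600     # a read that has waited 600 ms
request_expired write 600    # a write that has waited 600 ms
```

        A read that has waited 600 ms is past its deadline, while a write of
        the same age still has plenty of time left.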
    
    
      * The Anticipatory I/O Scheduler (anticipatory)
        - Default from 2.6.0 until 2.6.18
        - With the Deadline I/O Scheduler, there is still significant seek time
          bouncing back and forth between streaming writes and sequential reads
        - To overcome some of this, this I/O scheduler completes all the reads,
          and then, in anticipation that there may be another read for the next
          data blocks, waits up to 6 milliseconds for that read request.  If a
          read request arrives, it is serviced and another anticipatory wait
          occurs.  The Anticipatory I/O Scheduler uses a heuristic to better
          guess for which processes to wait, resulting in both improved
          request latency and global throughput.
    
    
      * The Completely Fair Queuing Scheduler (cfq)
        - Default since 2.6.18
        - Details of CFQ are below:
    
      A great feature of the CFQ scheduler is the ability to set priorities and
      scheduling classes for I/O, much as is done for CPU time and processes.
    
      The three scheduling classes are:

      Scheduling class   Number   Possible priority
      =============================================================================
      real time          1        8 priority levels are defined, denoting how big
                                  a time slice a given process will receive on
                                  each scheduling window
      -----------------------------------------------------------------------------
      best-effort        2        0-7, with lower number being higher priority
      -----------------------------------------------------------------------------
      idle               3        Nil (does not take a priority argument)
      -----------------------------------------------------------------------------
      Table from:
        http://www.cyberciti.biz/tips/linux-set-io-scheduling-class-priority.html
    
      If the CFQ I/O scheduler is enabled, then from the command line a typical
      job can be redefined to use a specific scheduling priority using `ionice`:
    
      [root@linux ~]# ionice -h
      Usage: ionice [OPTIONS] [COMMAND [ARG]...]
      Sets or gets process io scheduling class and priority.
    
            -n      Class data (typically 0-7, lower being higher prio)
            -c      Scheduling class
                            1: realtime, 2: best-effort, 3: idle
            -p      Process pid
            -h      This help page
    
    
      Much like nice or priocntl, a job can be started with ionice, or it can be
      modified after it is running.  In addition, the job can be started with both
      nice and ionice.
    
      For example (I know, it's a *bad* example as there's no I/O):
      [root@linux ~]# nice -n -19 ionice -c1 -n7 top
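
      Starting a job with ionice versus changing one that is already running
      might look like the sketch below.  The wrapper name run_idle_io and
      the PID 1234 are made up for illustration; the ionice options are the
      ones listed in the help output above.

```shell
#!/bin/sh
# run_idle_io is a hypothetical wrapper: it starts a command in the idle
# I/O class (-c3) when ionice is usable, and otherwise runs it unchanged.
run_idle_io() {
    if command -v ionice >/dev/null 2>&1 && ionice -c3 true 2>/dev/null; then
        ionice -c3 "$@"
    else
        "$@"
    fi
}

# Start a job at idle I/O priority.
run_idle_io echo "backup would run here"

# To change a job that is already running, pass its PID instead, e.g.:
#   ionice -c2 -n7 -p 1234    # best-effort class, lowest priority
#   ionice -p 1234            # show the current class and priority
```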
    
      For more on CFQ, see:
            http://www.redhat.com/magazine/008jun05/features/schedulers/
    
    
    
    
    
    
    Take CFQ out for a test drive:
    ============================================
    
    * To see what I/O schedulers are configured, run:
    
      [root@linux ~]# cat /sys/block/sda/queue/scheduler
      noop anticipatory deadline [cfq]
    
      These are built into the kernel.  The name in brackets is the scheduler
      currently in use (here CFQ, the default).
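
      To pull out just the active scheduler for every disk, something like
      this sketch should work.  The helper name active_scheduler is made up
      here; it only relies on the bracket notation shown in the output above.

```shell
#!/bin/sh
# active_scheduler FILE  (hypothetical helper)
# Prints the name inside [brackets] from a scheduler file such as
# /sys/block/sda/queue/scheduler.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Report the active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    echo "$dev: $(active_scheduler "$f")"
done
```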
    
    
    * To test an I/O scheduler on the fly, run this for each of the disks:
    
      echo 'deadline' > /sys/block/sda/queue/scheduler
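
      Rather than typing that once per disk, a loop can do it.  This is a
      sketch that assumes the disks show up as sd* under /sys/block; the
      function name set_scheduler and its optional second argument are made
      up for this example.

```shell
#!/bin/sh
# set_scheduler NAME [SYSFS_ROOT]  (hypothetical helper)
# Writes NAME to the scheduler file of every sd* disk under SYSFS_ROOT
# (default /sys/block).  Needs root on a real system.
set_scheduler() {
    name=$1
    root=${2:-/sys/block}
    for f in "$root"/sd*/queue/scheduler; do
        [ -w "$f" ] || continue
        echo "$name" > "$f" || echo "could not set $name on $f" >&2
    done
}

# Example (as root):  set_scheduler deadline
```

      The change takes effect immediately and does not survive a reboot.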
    
    
    * To set an I/O scheduler (other than the default) to run at boot time,
      append "elevator=XXX" to the kernel line in /boot/grub/grub.conf,
      where XXX is the name of the scheduler (noop, deadline, etc.)
    
      e.g. kernel /vmlinuz ro root=/dev/sda1 elevator=deadline
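
      After a reboot, one way to confirm that the option was picked up is to
      look for it on the running kernel's command line.  A small sketch; the
      helper name boot_elevator is made up for this example.

```shell
#!/bin/sh
# boot_elevator [CMDLINE_FILE]  (hypothetical helper)
# Prints the value of elevator= from the kernel command line, or nothing
# if the option was not given.
boot_elevator() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^elevator=//p'
}

if [ -r /proc/cmdline ]; then
    boot_elevator
fi
```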
    
    
    * In addition to having multiple I/O schedulers available and (with CFQ)
      multiple I/O classes and priorities, there are still some additional
      values to tweak to get the most out of the scheduler:
    
      [root@linux ~]# find /sys/block/sda/queue
      /sys/block/sda/queue
      /sys/block/sda/queue/iosched
      /sys/block/sda/queue/iosched/slice_idle
      /sys/block/sda/queue/iosched/slice_async_rq
      /sys/block/sda/queue/iosched/slice_async
      /sys/block/sda/queue/iosched/slice_sync
      /sys/block/sda/queue/iosched/back_seek_penalty
      /sys/block/sda/queue/iosched/back_seek_max
      /sys/block/sda/queue/iosched/fifo_expire_async
      /sys/block/sda/queue/iosched/fifo_expire_sync
      /sys/block/sda/queue/iosched/queued
      /sys/block/sda/queue/iosched/quantum
      /sys/block/sda/queue/scheduler
      /sys/block/sda/queue/max_sectors_kb
      /sys/block/sda/queue/max_hw_sectors_kb
      /sys/block/sda/queue/read_ahead_kb
      /sys/block/sda/queue/nr_requests
      [root@linux ~]#
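
      To see the current values rather than just the file names, a loop like
      this sketch prints each tunable next to its setting.  The function
      name show_tunables is made up; the default path is the iosched
      directory from the listing above.

```shell
#!/bin/sh
# show_tunables [DIR]  (hypothetical helper)
# Prints "name = value" for each readable file in an iosched directory.
show_tunables() {
    dir=${1:-/sys/block/sda/queue/iosched}
    for f in "$dir"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        echo "${f##*/} = $(cat "$f")"
    done
}

show_tunables
```

      The files listed depend on which scheduler is active for that disk.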
    
    
      For more, see:
      http://www.linuxjournal.com/article/6931
      http://www.cyberciti.biz/tips/linux-set-io-scheduling-class-priority.html
      http://archives.postgresql.org/pgsql-performance/2009-04/msg00166.php
      http://blog.4rev.net/2009-07/ionice-linux-utility-io-scheduling-class-priority-for-a-program-or-script/
      http://friedcpu.wordpress.com/2007/07/17/why-arent-you-using-ionice-yet/
      http://www.wlug.org.nz/LinuxIoScheduler
    
    
    



    This site contains many of my notes from research into different aspects of the Linux kernel as well as some of the software provided by GNU and others. Though these notes are not fully comprehensive or even completely accurate, they are part of my on-going attempt to better understand this complex field. And, they are yours to use.

    Should you wish to report any errors or suggestions, please let me know.

    Should you wish to make a donation for anything you may have learned here, please direct that donation to the ASPCA, with my sincere thanks.

    Brett Lee
    Everything Penguin

    The code for this site, which is just a few CGI scripts, may be found on GitHub (https://github.com/userbrett/cgindex).

    For both data encryption and password protection, try Personal Data Security (https://www.trustpds.com).


    "We left all that stuff out. If there's an error, we have this routine called 'panic', and when its called, the machine crashes, and you holler down the hall, 'Hey, reboot it.'"

        - Dennis Ritchie on Unix (vs Multics)

