A better way to parse variables in bash.

I am sure many of you have had the problem where, at some point in your bash script, you have a large blob of formatted key-value text and you need that data as variables in your script! Well, there are lots of ways to do this.
One way is to run a loop and toss the keys and values into a pair of arrays, then search the key array for an index number and retrieve the value from the value array (or, if you don't have to support older OSes, use an associative array). This is cumbersome and leads to some difficult-to-read code.
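For reference, the associative-array flavor of that approach might look something like the sketch below. It assumes bash 4 or newer, and the sample text and its values are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the associative-array approach (needs bash 4 or newer).
# The sample blob and its values are invented for this example.
blob='Master_Host: db1.example.com
Master_Port: 3306
Slave_IO_Running: Yes'

declare -A kv
while read -r key value; do
    key=${key%:}                       # drop the trailing colon from the key
    [[ -n $key ]] && kv[$key]=$value   # store the pair, skipping empty keys
done <<< "$blob"

echo "${kv[Master_Host]}:${kv[Master_Port]}"
```

It works, but every later use has to go through the kv[...] lookup, which is part of why I find it harder to read.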
Another popular method is to use grep to pull out each key-value pair. This is not as flexible, but leads to easier-to-read code. It also leads to very nasty debug output.
There are several more ways that I have tried, and I have hated all of them. Until today! Today I was presented with the most completely awesome bash trick I have seen in at least a year. And I will share this with you!

First let me describe the problem. Since I am a MySQL DBA, I am going to use SHOW SLAVE STATUS as an example. And for the sake of example I will use the grep | awk (or cut) method mentioned above.

function parse_show_slave_status(){
    unset Master_Host
    unset Master_Port
    unset Master_Logs_Pos
    unset Master_Log_File
    unset Slave_IO_Running
    unset Slave_SQL_Running
    unset Relay_Master_Log_File
    unset Seconds_Behind_Master
    unset Exec_master_log_pos
    unset Last_error
    unset Last_errno
    local  status=$1
    if [[ -z $status ]]; then
        return 1
    fi
    Slave_IO_Running=$( echo "$status" | grep -i 'Slave_IO_Running' | awk '{print $2}')
    Slave_SQL_Running=$( echo "$status" | grep -i 'Slave_SQL_Running' | awk '{print $2}')
    Seconds_Behind_Master=$( echo "$status" | grep -i 'Seconds_Behind_Master' | awk '{print $2}')
    Master_Host=$(echo "$status" | grep -i 'Master_Host' | awk '{print $2}')
    Master_Port=$(echo "$status" | grep -i 'Master_Port' | awk '{print $2}')
    Master_Logs_Pos=$(echo "$status" | grep -i 'Read_Master_Log_Pos' | awk '{print $2}')
    Master_Log_File=$(echo "$status" | grep -i ' Master_Log_File' | awk '{print $2}')
    Exec_master_log_pos=$(echo "$status" | grep -i 'Exec_master_log_pos' | awk '{print $2}')
    Relay_Log_File=$(echo "$status" | grep -i 'Relay_Log_File' | awk '{print $2}')
    Relay_Log_Pos=$(echo "$status" | grep -i 'Relay_Log_Pos' | awk '{print $2}')
    Relay_Master_Log_File=$(echo "$status" | grep -i 'Relay_Master_Log_File' | awk '{print $2}')
    Last_error=$(echo "$status" | grep -i "Last_error" | sed -e 's/^[ \t]*Last_Error: //I' )
    Last_errno=$(echo "$status" | grep -i "Last_errno" | awk '{print $2}')
    return 0
}
slave_status=$(mysql -h"$host" -e 'SHOW SLAVE STATUS\G')
# Quote the argument so the whole blob arrives as $1 instead of just its first word
parse_show_slave_status "$slave_status"
if [[ $? -eq 0 ]]; then
    echo $Slave_IO_Running
fi

 

Now the above works. It is easy to read, and it will echo Yes or No depending on the state of the slave. However, it will give you a fresh copy of the show slave status output for every variable in the logs when you run it with bash -x. This is annoying as sin. In addition, that output is quite cumbersome to read through, adds little to your debugging efforts, and the whole thing takes longer to process. And most importantly, it can all be replaced with a small loop around one simple printf statement. Behold The Glory!

function parse_show_slave_status(){
    local status=$1
    local sskey ssvalue key
    if [[ -z $status ]]; then
        return 1
    fi
    while read -r sskey ssvalue; do
        key=${sskey%:}                                      # strip the trailing colon
        [[ $key =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue  # skip the "1. row" banner line
        printf -v "$key" '%s' "$ssvalue"                    # '%s' so a % in the value stays literal
    done < <( echo "$status" )
    return 0
}
slave_status=$(mysql -h"$host" -e 'SHOW SLAVE STATUS \G')
parse_show_slave_status "$slave_status"
if [[ $? -eq 0 ]]; then
    echo "$Master_Host : $Master_Port"
fi


*NOTE: When we change themes and code formatters around, sometimes
done < <( echo "$status" ) gets rendered with &lt; instead of <

Now that, my friends, is the way to do it 🙂

First of all, this is much shorter and easier to read. Though if I were doing this as part of a larger script, I would put in a comment block listing the variables that get set by this function so it would be easier to follow. Nevertheless, every key, like Relay_Master_Log_File, or Exec_Master_Log_Pos, or Master_Host, or Master_Port, gets set with a value. If you are confused, I would again like to direct you to the highlighted text on this page that shows what the output of SHOW SLAVE STATUS \G looks like.
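To make the mapping from keys to variables concrete, here is a small self-contained run of the same kind of loop against a made-up sample. The host name, port, and error text are invented for illustration; only the key names come from SHOW SLAVE STATUS. The name check and the '%s' format are guard rails I am assuming you would want, not a required part of the trick.

```shell
#!/usr/bin/env bash
# Hypothetical sample in the shape of SHOW SLAVE STATUS \G output;
# the host, port, and error text are invented for this demo.
sample='*************************** 1. row ***************************
                  Master_Host: db1.example.com
                  Master_Port: 3306
             Slave_IO_Running: Yes
                   Last_Error: Could not parse 100% of relay log'

valid_name='^[A-Za-z_][A-Za-z0-9_]*$'
while read -r sskey ssvalue; do
    key=${sskey%:}                          # "Master_Host:" becomes "Master_Host"
    [[ $key =~ $valid_name ]] || continue   # skips the "1. row" banner line
    printf -v "$key" '%s' "$ssvalue"        # assigns to the variable named by $key
done < <( echo "$sample" )

echo "$Master_Host : $Master_Port"
echo "$Last_Error"
```

Because read splits on the first run of whitespace, everything after the key lands in ssvalue, so a multi-word value like the Last_Error text above survives intact.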
Now this does have some drawbacks.

  1. I have not managed to make these local variables
  2. Even unused or undesired variables will consume space.
  3. This is not SH compatible due to the use of process substitution to feed variables to the while loop.
  4. Also, you must have bash version 3.1 or better, since that is where printf -v showed up. So all you people rocking a decade-old OS are out of luck 😉
    You can find your bash version simply by running: echo $BASH_VERSION
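One possible way around the first two drawbacks, offered as a sketch rather than a tested production pattern: have the caller declare the variables it wants as local and pass their names along. Bash scoping is dynamic, so a printf -v in the called function lands in the caller's locals. The parse_kv and report functions and the sample values below are hypothetical names made up for this example, and it is still bash-only.

```shell
#!/usr/bin/env bash
# parse_kv is a hypothetical helper: it sets only the keys named after "$1".
parse_kv() {
    local text=$1 sskey ssvalue key wanted
    shift
    while read -r sskey ssvalue; do
        key=${sskey%:}
        for wanted in "$@"; do
            if [[ $key == "$wanted" ]]; then
                printf -v "$key" '%s' "$ssvalue"
            fi
        done
    done <<< "$text"
}

report() {
    # Bash scoping is dynamic, so parse_kv assigns to these locals.
    local Master_Host Master_Port
    parse_kv "$1" Master_Host Master_Port
    echo "$Master_Host : $Master_Port"
}

# Invented sample values, for illustration only.
sample='Master_Host: db1.example.com
Master_Port: 3306
Slave_IO_Running: Yes'

report "$sample"
echo "after: ${Master_Host:-unset} ${Slave_IO_Running:-unset}"
```

The last line is there to show that nothing leaked into the global scope: the two requested keys lived only inside report, and Slave_IO_Running was never set at all.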

 

 

Now I do not consider any of this to be a show stopper, or even much of an issue at all. Usually I want a large number of the keys from a blob, if not all of them. In addition, unless you are returning only a single value from a function (or passing serialized JSON objects, which I occasionally do), global variables are probably what you want to be using. So while not perfect, it is a darn nice trick!
Special thanks to David (DXJ) for showing me this awesome trick and, as always, Catlin, who co-authors this site with me.

About

I am a DBA, a programmer, and a sysadmin. My title is Engineer and I work for one of the largest Domain and Hosting companies in the world. We are also one of the largest MySQL shops on the planet, as well as being responsible for PostgreSQL, MongoDB, Cassandra, Hadoop, Redis, Memcache, and even some MS SQL.
My team here is amazing and we do amazing things every day. And this is a spot for me to talk about it.

We control millions of databases. We control the hardware. We control the software. Do not attempt to adjust your scheme. We are watching.