Sunday, July 13, 2008

Some of the most useful bash features

edit-and-execute-command (C-xC-e)
Invoke an editor on the current command line, and execute the result as shell
commands. Bash attempts to invoke $FCEDIT, $EDITOR, and emacs as the editor, in that
order.

insert-completions (M-*)
Insert all completions of the text before point that would have been generated by
possible-completions.

complete-into-braces (M-{)
Perform filename completion and insert the list of possible completions enclosed
within braces so the list is available to the shell (see Brace Expansion above).

insert-comment (M-#)
Without a numeric argument, the value of the readline comment-begin variable is
inserted at the beginning of the current line. If a numeric argument is supplied,
this command acts as a toggle: if the characters at the beginning of the line do not
match the value of comment-begin, the value is inserted, otherwise the characters in
comment-begin are deleted from the beginning of the line. In either case, the line is
accepted as if a newline had been typed. The default value of comment-begin causes
this command to make the current line a shell comment. If a numeric argument causes
the comment character to be removed, the line will be executed by the shell.

Keyboard Macros
start-kbd-macro (C-x ()
Begin saving the characters typed into the current keyboard macro.
end-kbd-macro (C-x ))
Stop saving the characters typed into the current keyboard macro and store the
definition.
call-last-kbd-macro (C-x e)
Re-execute the last keyboard macro defined, by making the characters in the macro
appear as if typed at the keyboard.

display-shell-version (C-x C-v)
Display version information about the current instance of bash.

Friday, July 11, 2008

HOWTO ptrace a multi-process program?

HOWTO-ptrace-multi-process-programs?



References:

  • Playing with ptrace, Part I (LinuxJournal)

  • Playing with ptrace, Part II (LinuxJournal)

  • 以 ptrace 系統呼叫來追蹤/修改行程 (JservBlog)

one line curl paste

    Recently a utility named wgetpaste caught my attention, but on closer inspection it turned out to be just a shell script that assembles its arguments and passes them to wget, using wget's POST mode:



    $ head -n1 /usr/bin/wgetpaste
    #!/bin/sh


    In fact, I found that curl is better suited to this job. To post some code to a paste service, a single curl command is enough:



    $ curl -d poster=chengrq -d syntax=c --data-urlencode content@file.c \
    http://paste.ubuntu.com


    And if you would rather read from a pipe instead of a file, replace the filename with '-'; it behaves just as you would expect.



    BTW, Ubuntu did a good job on the paste service, which supports syntax highlighting for many languages.

    undocumented getopt

    First, the standard usage of getopt, as described in the manual:

    $ man 3 getopt

    #include <unistd.h>

    int getopt(int argc, char * const argv[],
    const char *optstring);

    extern char *optarg;
    extern int optind, opterr, optopt;

    #define _GNU_SOURCE
    #include <getopt.h>

    int getopt_long(int argc, char * const argv[],
    const char *optstring,
    const struct option *longopts, int *longindex);

    int getopt_long_only(int argc, char * const argv[],
    const char *optstring,
    const struct option *longopts, int *longindex);



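    This says that including the unistd.h header is all that is needed to use getopt, whose prototype takes the three parameters argc, argv, and optstring. To use the GNU extension getopt_long, you also need to include getopt.h and define the _GNU_SOURCE macro before that header. You can #define it directly in the C source before including getopt.h, but I usually add it to CFLAGS in the Makefile so that it is passed to gcc as an argument;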



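    so the actual compile command is

    gcc -Wall -D_GNU_SOURCE filename.c
    This way a large project made of many C files does not need a #define _GNU_SOURCE in every file; it is written once in the Makefile, which reduces the total byte count.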
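The getopt_long prototype shown in the synopsis can be exercised with a small sketch like the one below. The option names (--verbose, --time), the settings struct, and the parse_long function are hypothetical names chosen for illustration; only getopt_long, struct option, and the no_argument/required_argument constants come from getopt.h.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* may already be set via -D_GNU_SOURCE */
#endif
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical result struct for this sketch. */
struct settings {
    int verbose;             /* set by -v or --verbose */
    int nsecs;               /* set by -t N or --time N */
};

/* Parse argv with getopt_long(). Returns the index of the first
 * non-option argument, or -1 if an option was rejected. */
int parse_long(int argc, char *argv[], struct settings *s)
{
    /* Each long option maps onto the short option character it mirrors. */
    static const struct option longopts[] = {
        { "verbose", no_argument,       NULL, 'v' },
        { "time",    required_argument, NULL, 't' },
        { NULL,      0,                 NULL, 0   }
    };
    int opt;

    s->verbose = 0;
    s->nsecs = 0;
    while ((opt = getopt_long(argc, argv, "vt:", longopts, NULL)) != -1) {
        switch (opt) {
        case 'v':
            s->verbose = 1;
            break;
        case 't':
            s->nsecs = atoi(optarg);
            break;
        default:             /* '?': getopt_long already printed a message */
            return -1;
        }
    }
    return optind;
}
```

A caller would typically invoke parse_long(argc, argv, &s) from main and then walk argv from the returned index up to argc.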



    Taking the example from the manual and testing it:




    #include <unistd.h>
    #include <stdlib.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int flags, opt;
        int nsecs, tfnd;

        nsecs = 0;
        tfnd = 0;
        flags = 0;
        while ((opt = getopt(argc, argv, "nt:")) != -1) {
            switch (opt) {
            case 'n':
                flags = 1;
                break;
            case 't':
                nsecs = atoi(optarg);
                tfnd = 1;
                break;
            default: /* ? */
                fprintf(stderr, "Usage: %s [-t nsecs] [-n] name\n",
                        argv[0]);
                exit(EXIT_FAILURE);
            }
        }

        printf("flags=%d; tfnd=%d; optind=%d, nsecs=%d\n",
               flags, tfnd, optind, nsecs);

        if (optind >= argc) {
            fprintf(stderr, "Expected argument after options\n");
            exit(EXIT_FAILURE);
        }

        while (optind < argc)
            printf("name argument = %s\n", argv[optind++]);

        /* Other code omitted */

        exit(EXIT_SUCCESS);
    }





    Compile and run:



    $ gcc -Wall opt1.c
    $ ./a.out -t 3 -n arg1 -t 4 arg2
    flags=1; tfnd=1; optind=6, nsecs=4
    name argument = arg1
    name argument = arg2


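    From this we can see:

    1. When the same option is given more than once, the last occurrence wins (-t 4 overrides the value from -t 3);

    2. The remaining arguments can be accessed starting at index optind; while it runs, getopt permutes the argv string array and moves all non-option arguments to the end;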



    So how do you pass a non-option argument that begins with "-"?

    The manual mentions in passing that "--" can be used to turn off option scanning:



    $ ./a.out -t 3 -n arg1 -t 4 arg2 -- -t 6 --new null
    flags=1; tfnd=1; optind=7, nsecs=4
    name argument = arg1
    name argument = arg2
    name argument = -t
    name argument = 6
    name argument = --new
    name argument = null



    In this run, "-t 6" is no longer treated as an option; it is handed to the program as an extra argument.




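    Next, the purposes of optarg, optind, optopt, and opterr:

    1. optarg holds the argument of an option whose character is followed by ":" in optstring;

    2. optind is the position of the first argument once getopt has finished (it returns EOF, i.e. -1, to signal completion); getopt only reorders the argv array and does not change argc, so looping from optind to argc yields all the non-option arguments;

    3. optopt and opterr both deal with the case where the user passes an undefined option character. getopt then returns the '?' character, stores the option character the user actually typed in optopt, and automatically prints an error message to stderr, such as "invalid option -- h". If you do not want this message, set opterr to 0 beforehand to turn it off;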



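    The manual also describes the use of optstring in detail:

    1. A single character is a switch (flag);

    2. A character followed by ":" takes a required argument; after the getopt call this argument is placed in the global optarg;
    3. A character followed by two colons "::" takes an optional argument: with "f::", the input "-ffoo" sets optarg to "foo", whereas "-f foo" sets optarg to NULL and "foo" is counted among the non-option arguments;

    4. If optstring begins with "+", getopt stops at the first non-option: "-t 3 -n arg1 -t 4 arg2" stops at "arg1", and everything from there on ("arg1 -t 4 arg2") is treated as arguments;

    5. If optstring begins with "-", getopt handles every non-option item (i.e. argument) as if it were the argument of an option whose character code is 1 (the number 1, not the character '1'): getopt returns 1 and optarg points to the argument;

    6. If the first character of optstring after an optional "+" or "-" prefix is a colon ':', getopt returns ':' instead of '?' when an option is missing its required argument (an unrecognized option character still returns '?'), and it prints no error message; see the description of optopt and opterr above;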



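    Everything above is normal getopt usage as covered by the manual. What follows is an undocumented getopt feature that I discovered in practice:


    I had a program that needed to run getopt through several rounds:

    Normally, once getopt has returned all the options for a given argc, argv, its final call returns EOF to indicate that the argv array has been fully processed; here I call this one round of getopt calls;

    Multiple rounds means: after one round has finished, how can the program go on using getopt to process another argc, argv?


    Some might say: just call getopt on the new argc, argv. It is not that simple:

    getopt can be called in a loop because it uses static variables internally to save the state of its work on argc, argv. When the first round of getopt finishes, these internal static variables record that processing has reached the end of argc, argv; so if you call getopt directly on a new argc, argv, you find that it does not work at all and simply returns EOF;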


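    So it is worth studying how getopt works internally. It is a function provided by the standard C library, so fetch glibc with

    apt-get source glibc
    or
    ebuild /usr/portage/sys-libs/glibc/glibc-x.x.ebuild unpack
    to obtain the source code, and locate posix/getopt.c, the file that implements this library function (I uploaded a copy to the paste service). Analyzing how these static variables are initialized shows the rule:

    getopt initializes itself according to the value of optind (lines 406 and 1133);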

    The call sequence and the related data structure are:



    static struct _getopt_data getopt_data;


    getopt (int argc, char *const *argv, const char *optstring)


    _getopt_internal (int argc, char *const *argv, const char *optstring,


      getopt_data.optind = optind;


     _getopt_internal_r (int argc, char *const *argv, const char *optstring,


      if (d->optind == 0 || !d->__initialized)


          optstring = _getopt_initialize (argc, argv, optstring, d);


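    This shows that _getopt_initialize is called to reinitialize whenever d->optind is 0 or d->__initialized is false. The d pointer passed to _getopt_internal_r is always &getopt_data, a file-scope variable (a static global) that is not visible outside this file of the C library, so it cannot be set directly as getopt_data.optind. The call path above, however, shows that d->optind is initialized from the global variable optind on every call. We can therefore set optind to 0 and then call getopt, so that the d->optind passed in is 0 and _getopt_initialize is called once more, reinitializing the internal static state that getopt uses (getopt_data) and making a new round of getopt calls possible;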



    So, before each new round of getopt calls, the program sets

    optind=0;
    and getopt works again!



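    To summarize:


    1. The getopt(3) manual describes optarg, optind, and optopt as outputs (out parameters): they are read-only for the user program, which reads information from them after a getopt call. opterr is write-only (an in parameter) and must be set to 0 or 1 before getopt is called; the getopt implementation never changes opterr and only varies its behavior according to its value;

    2. Reading the getopt implementation in the C library shows that optind can also serve as an input (in parameter): setting optind to 0 before calling getopt makes getopt reinitialize itself internally;

    3. The comment on optind at getopt.c line 114 says "zero means the first call", while the manual does not mention optind's role as an in parameter.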