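` char* strInput = (char*) malloc(sizeof(char));  /* room for the terminating '\0' */
        int ch;
        int letNum = 0;
        while((ch = getchar()) != EOF){
                letNum++;
                /* grow by one char, always keeping one extra byte for '\0' */
                strInput = (char*)realloc(strInput,(letNum + 1)*sizeof(char));
                *(strInput + letNum - 1) = ch;
        }
        *(strInput + letNum) = '\0';  /* printf("%s") needs a terminated string */
        printf("\n");
        printf("%s\n",strInput);
        free(strInput);`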

This is the contents of main in a program I wrote that takes an undefined number of chars and prints the final string. I don’t understand why, but it only works if I press Ctrl+D twice, or just once if I press Enter first.

Does anyone get what’s going on? And how would you have written the program?

  • offbyone
    1 year ago

    If you’re on Linux then I’m pretty sure the confusing behavior you’re seeing is due to the line buffering the kernel’s tty layer does by default. Ctrl+D does not actually mean “send EOF”, and no “EOF character” is ever delivered to your program. Rather, it means “complete the current or next stdin read() request immediately”. That’s a very different thing: sometimes it results in EOF and other times it does not.

    In practice what this means is that, if there is no data waiting to be sent on stdin when you press Ctrl+D, then read() returns zero, and read() returning zero is how getchar() knows an EOF happened. The flow looks like this:

    1. Your program calls getchar().
    2. getchar() calls read() on stdin and your program blocks waiting for input.
    3. The user presses Ctrl+D on the tty, having not typed anything else.
    4. The kernel immediately ends the blocked read() call and returns zero bytes read.
    5. getchar() sees that it got no bytes from read() and returns EOF.
    6. Your program sees that and exits the loop.
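
    The six steps above can be sketched with a simplified stand-in for getchar(). This is only an illustration: fd_getchar is a made-up name, and the real getchar() fills a whole buffer per read() rather than fetching one byte, but the mapping from “read() returned zero” to EOF is the same idea.

    ```c
    #include <stdio.h>   /* EOF */
    #include <unistd.h>  /* read() */

    /* Hypothetical, unbuffered stand-in for getchar(): fetch one byte
     * from fd and translate read()'s return value into a char or EOF. */
    int fd_getchar(int fd)
    {
        unsigned char byte;
        ssize_t n = read(fd, &byte, 1);   /* blocks until data or end of input */

        if (n == 1)
            return byte;   /* got a byte: return it as a non-negative int */
        return EOF;        /* n == 0 means zero bytes read; n < 0 is an error */
    }
    ```

    A loop like `while ((ch = fd_getchar(STDIN_FILENO)) != EOF)` would then end on the first zero-byte read().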

    However, in practice it doesn’t work that cleanly because the tty is normally operating in “cooked” (canonical) mode, where the kernel sends input to your program line by line, allowing the user to edit a single line before sending it. The way this works is by buffering the stdin contents and sending them when the user hits Enter. Going back to Ctrl+D, you can see how this screws things up, leading to the behavior you see:

    1. Your program calls getchar().
    2. getchar() calls read() on stdin and your program blocks waiting for input.
    3. The user types some input, but does not hit Enter. This data sits in the kernel’s stdin buffer and is not sent to your program yet.
    4. The user presses Ctrl+D on the tty.
    5. The kernel immediately ends the blocked read() call and starts returning the currently buffered stdin input, without waiting for an enter press.
    6. getchar() sees that it got a byte from read() and thus returns it.
    7. Your program starts getting all the previously buffered bytes and keeps running until getchar() has seen all of them.
    8. getchar() calls read() on stdin. There are now no bytes in the buffer, so you block waiting for input, the same as before. The previous Ctrl+D was already “used up” to end the previous read() call, so it doesn’t matter any more.
    9. The user types Ctrl+D.
    10. Because there is currently no input in the line buffer, read() returns zero. getchar() sees this and returns EOF.
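
    To watch this sequence happen, a small helper that calls read() directly and reports what each call returns can help. This is a sketch: log_reads is a hypothetical name, and the 256-byte buffer is an arbitrary choice.

    ```c
    #include <stdio.h>
    #include <unistd.h>

    /* Hypothetical helper: call read() on fd until it returns 0 (or fails),
     * reporting the size of every chunk. Returns the total bytes seen. */
    long log_reads(int fd)
    {
        char buf[256];
        long total = 0;
        ssize_t n;

        while ((n = read(fd, buf, sizeof buf)) > 0) {
            fprintf(stderr, "read() returned %ld bytes\n", (long)n);
            total += n;
        }
        fprintf(stderr, "read() returned %ld, stopping\n", (long)n);
        return total;
    }
    ```

    Calling `log_reads(STDIN_FILENO)` from main on a terminal in its default mode, then typing `abc` followed by Ctrl+D, should report one 3-byte read. Only a second Ctrl+D should produce the zero-byte read that stops the loop.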

    In the above case Ctrl+D doesn’t work as expected because of the line buffering. The read() call does end early instead of waiting for Enter, but all your program sees is the buffered input arriving, so it has no idea you pressed Ctrl+D and never gets the read() == 0 EOF condition. Additionally, Ctrl+D is a one-time deal: it ends one read() call early and sends the buffered input. When you call read() again with nothing to send, it just blocks, and you have to press Ctrl+D again to actually get read() to return zero.
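
    Whether the terminal is doing this line buffering can be checked from a program through the termios interface. The helper below is a sketch (tty_is_canonical is a made-up name); it only inspects the ICANON flag and the control character stored in c_cc[VEOF].

    ```c
    #include <stdio.h>
    #include <termios.h>
    #include <unistd.h>

    /* Hypothetical helper: returns 1 if fd is a terminal in canonical
     * (line-buffered) mode, 0 if it is a terminal in non-canonical mode,
     * and -1 if tcgetattr() fails (e.g. fd is a pipe or a file). */
    int tty_is_canonical(int fd)
    {
        struct termios settings;

        if (tcgetattr(fd, &settings) != 0)
            return -1;                   /* not a tty, or another error */

        /* c_cc[VEOF] is the control character the tty reacts to as
         * described above; by default it is 4, i.e. Ctrl+D. */
        fprintf(stderr, "VEOF control code: %d\n", (int)settings.c_cc[VEOF]);

        return (settings.c_lflag & ICANON) ? 1 : 0;
    }
    ```

    From a shell, `stty -a` reports the same settings as `icanon` and `eof = ^D`.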

    You can see the line buffering behavior if you add a putchar() inside your loop. The putchar() doesn’t actually print while you type the characters, it only prints after you hit either enter or Ctrl+D, showing that your program did not receive any of the characters until one of those two actions happened.
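
    A minimal version of that experiment might look like the following. echo_stream is a hypothetical name, and the fflush() call is an addition of this sketch: it rules out stdio’s own output buffering, so any delay you still see comes from the terminal.

    ```c
    #include <stdio.h>

    /* Hypothetical helper: copy in to out one character at a time,
     * flushing after each so stdio's output buffer adds no delay.
     * Returns the number of characters copied. */
    long echo_stream(FILE *in, FILE *out)
    {
        long count = 0;
        int ch;

        while ((ch = getc(in)) != EOF) {
            putc(ch, out);
            fflush(out);   /* any remaining delay comes from the terminal */
            count++;
        }
        return count;
    }
    ```

    Calling `echo_stream(stdin, stdout)` from main and typing slowly should show nothing being echoed by the program until Enter or Ctrl+D is pressed.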

    • rastignac@programming.dev (OP)
      1 year ago

      Thanks a lot for the in-depth explanation, this makes things a lot clearer. I’ll try putchar() and test a few more things, and then come back to read this post again.