647

What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?

Gabriel Staples
  • 36,492
  • 15
  • 194
  • 265
Zolomon
  • 9,359
  • 10
  • 36
  • 49
  • 1
    I was pretty sure that we've done this before, but I can't find it. See also: http://stackoverflow.com/questions/2301372/undefined-unspecified-implementation-defined-behaviour-warnings – dmckee --- ex-moderator kitten Mar 07 '10 at 22:02
  • 1
    http://theunixshell.blogspot.com/2013/07/what-is-undefined-behavior.html – Vijay Jul 10 '13 at 06:36
  • 1
    Here is [an interesting discussion](http://www.drdobbs.com/go-parallel/article/print?articleId=232901670&siteSectionName=) (the section "Annex L and Undefined Behavior"). – Owen Jul 21 '13 at 04:37
  • 2
    From the comp.lang.c FAQ: [People seem to make a point of distinguishing between implementation-defined, unspecified, and undefined behavior. What do these mean?](http://c-faq.com/ansi/undef.html) – jamesdlin Mar 08 '10 at 02:37
  • https://en.cppreference.com/w/cpp/language/ub – Jesper Juhl Mar 29 '23 at 12:49

9 Answers

485

Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!

Let's look at a classic example:

#include <iostream>
    
int main()
{
    char* p = "hello!\n";   // yes I know, deprecated conversion
    p[0] = 'y';
    p[5] = 'w';
    std::cout << p;
}

The variable p points to the string literal "hello!\n", and the two assignments that follow try to modify that string literal. What does this program do? According to the C++ standard, [lex.string] note 4, it invokes undefined behavior:

The effect of attempting to modify a string literal is undefined.

I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; the C++ standard has the only vote, period.
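If you actually want a string you are allowed to modify, copy the literal into an array (or use std::string). Here is a minimal sketch of a well-defined version of the program above:

#include <iostream>

int main()
{
    char p[] = "hello!\n";   // the literal is copied into a modifiable array
    p[0] = 'y';
    p[5] = 'w';
    std::cout << p;          // prints "yellow" followed by a newline
}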

Other examples of undefined behavior include accessing an array out of bounds, dereferencing a null pointer, reading an uninitialized variable, and signed integer overflow.
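As a rough sketch of my own (not taken from the standard), each of these typically compiles, possibly with a warning, yet has no defined meaning:

#include <climits>

int main()
{
    int a[3] = {1, 2, 3};
    int i = a[3];          // out-of-bounds read: undefined behavior
    int n = INT_MAX;
    ++n;                   // signed integer overflow: undefined behavior
    int* p = nullptr;
    return *p + i + n;     // dereferencing a null pointer: undefined behavior
}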

[intro.defs] also defines undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:

implementation-defined behavior    [defns.impl.defined]

behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents

unspecified behavior    [defns.unspecified]

behavior, for a well-formed program construct and correct data, that depends on the implementation

[Note: The implementation is not required to document which behavior occurs. The range of possible behaviors is usually delineated by this document. — end note]

undefined behavior    [defns.undefined]

behavior for which this document imposes no requirements

[Note: Undefined behavior may be expected when this document omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). [...] — end note]
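To make the three categories concrete, here is a small sketch of my own (not from the standard), with one construct from each:

#include <iostream>

int f() { std::cout << "f"; return 1; }
int g() { std::cout << "g"; return 2; }

int main()
{
    std::cout << sizeof(long) << '\n';  // implementation-defined: each compiler documents the size it uses
    int x = f() + g();                  // unspecified: prints "fg" or "gf"; the implementation need not document which
    int* p = nullptr;
    *p = x;                             // undefined: the standard imposes no requirements at all
}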

What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.

Jan Schultke
  • 17,446
  • 6
  • 47
  • 96
fredoverflow
  • 256,549
  • 94
  • 388
  • 662
  • 11
    It's a weird fact that resulted from the merge that this answer only covers C++ but this question's tags include C. C has a different notion of "undefined behavior": It will still require the implementation to give diagnostic messages even if behavior is also stated to be undefined for certain rule violations (constraint violations). – Johannes Schaub - litb Nov 20 '10 at 04:45
  • @Johannes: That's bad, indeed. Why not link to several answers from the question? – sbi Nov 20 '10 at 16:19
  • 1
    In the code snippet, is this case (modifying a string literal) undefined behaviour because the string literal might have been allocated to a read-only text segment? – Benny Jan 17 '13 at 10:40
  • 15
    @Benoit It is undefined behavior because the standard says it's undefined behavior, period. On some systems, indeed string literals are stored in the read-only text segment, and the program will crash if you try to modify a string literal. On other systems, the string literal will indeed appear to change. The standard does not mandate what has to happen. That's what undefined behavior means. – fredoverflow Jan 17 '13 at 13:37
  • I suggest mentioning compiler optimisations: the compilers can and do completely remove entire blocks with undefined behaviour, whereas relying on unspecified behaviour is stupid because the compiled code might not do what you expect, but it can't be removed. (GCC at -O2 uses certain undefined behaviours to mark other code in the block as dead!) – Nicholas Wilson Apr 29 '13 at 09:29
  • 8
    @FredOverflow, Why does a good compiler allow us to compile code that gives undefined behavior? Exactly what *good* can compiling this kind of code give? Why didn't all good compilers give us a huge red warning sign when we are trying to compile code that gives undefined behavior? – Pacerier Sep 27 '13 at 08:53
  • 1
    So could a (standard-compliant) compiler output a binary that nukes the hard drive/bricks your device/installs malware when you give it a program with undefined behavior? – AJMansfield Nov 12 '13 at 01:51
  • 18
    @Pacerier There are certain things that are not checkable at compile time. For example it is not always possible to guarantee that a null pointer is never dereferenced, but this is undefined. – Tim Seguine Dec 08 '13 at 14:16
  • 1
    Beyond just being proficient at C++ legalese/spec (read good books) you also need to read your compiler manuals to understand what implementation defined and unspecified behavior you're working in. Especially helpful for microcontrollers since they often use proprietary compilers which makes it harder to google for help when you encounter something weird. – slebetman Jun 05 '14 at 00:37
  • " those aspects of the C++ language that can be surprising to programmers coming from other languages. Basically, it is possible to write C++ programs that do not behave in a predictable way" this is not a very good sentence because saying this makes it sound like undefined behaviour is a synonym for non-deterministic. – Celeritas Oct 01 '14 at 06:34
  • 5
    @Celeritas, undefined behavior *can* be non-deterministic. For example, it is impossible to know ahead of time what the contents of uninitialized memory will be, eg. `int f(){int a; return a;}`: the value of `a` may change between function calls. – Mark Oct 12 '15 at 20:56
  • 1
    @AJMansfield: Absolutely. More to the point, it can produce a program which has a security vulnerability, which allows an attacker to install malware. Probably most security vulnerabilities are a result of exploiting undefined behaviour. – Martin Bonner supports Monica Dec 22 '15 at 12:11
  • @MartinBonner: Even more to the point, it can lead a compiler to omit bounds checks which would only be relevant if the program would invoke Undefined Behavior, with the effect that even if the original UB would have been benign (e.g. an overflow in a computation whose result ends up getting ignored) the omitted bounds check could allow the out-of-range value to trigger dangerous behavior. – supercat Aug 05 '16 at 20:49
  • 2
    @fredoverflow continuing your string literal example - if a string literal is changed; it could impact other variables - take the example above and add `auto q= "hello!\n";` q now may or may not be "hello!\n" or "yellow\n" and worse; q may or may not be in the same function as p. – UKMonkey Jun 15 '18 at 13:13
  • 3
    @Pacerier Because undefined behavior was room deliberately left for useful or intended variations between systems. Systems existed where dereferencing the zero address gets zero values, and dereferencing the null pointer (undefined behavior!) to get automatic zero values was very useful on those systems. Systems exist where the hardware does saturating arithmetic, and overflowing or underflowing signed integers (undefined behavior!) is useful on that hardware. C abstracts hardware variations, but a major use of C is to access functionality and tradeoffs unique to various hardware. – mtraceur Sep 28 '20 at 19:20
  • @fredoverflow __code__ `int a; char s[20]; cin>>a; cin>>s;` __input__: `23bonapart` + Enter key hit now, __code__ `cout< – Abhishek Mane Apr 11 '21 at 08:57
  • @mark A good compiler might just delete every code path that calls the `f` function because it invokes undefined behaviour: reading an uninitialized variable. Your example doesn't seem non-deterministic to me. – Ekrem Dinçel Apr 30 '21 at 19:38
  • @EkremDinçel: For some definitions of "good", maybe. On the other hand, on many platforms, a compiler that guarantees that reading an uninitialized value of type `int` will simply yield an unspecified value of that type without side effects, given a program that exploits that guarantee, may be able to generate more efficient code to accomplish a task than could be generated for a program that didn't exploit it. – supercat May 30 '21 at 16:11
  • @supercat I don't remember why exactly I wrote my above comment. Looking at it again, undefined behavior can absolutely cause non-determinism at runtime. I guess I just wanted to point out that what Mark wrote is not the only possible outcome of that code; and since the code includes UB, one shouldn't rely on it unless they know what they are doing. – Ekrem Dinçel Jun 12 '21 at 16:53
  • @EkremDinçel: A fundamental problem with the evolution of the C language is that the authors of the Standard expected that compiler writers would want people to buy their products in a competitive marketplace, and there was thus no need to forbid them from behaving in ways that would be universally recognized as obtuse. It was never intended that programmers should have to jump through hoops to prevent obtuse non-commercial compilers breaking code that all commercial compilers would have processed identically. – supercat Jun 12 '21 at 17:03
120

Well, this is basically a straight copy-paste from the C standard:

3.4.1 implementation-defined behavior: unspecified behavior where each implementation documents how the choice is made

EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.

3.4.3 undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

EXAMPLE An example of undefined behavior is the behavior on integer overflow.

3.4.4 unspecified behavior: use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance

EXAMPLE An example of unspecified behavior is the order in which the arguments to a function are evaluated.
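Putting the three quoted examples into code (my own sketch; it is written to be valid as both C and C++):

#include <limits.h>
#include <stdio.h>

static int trace(const char *s) { puts(s); return 1; }
static int add(int a, int b) { return a + b; }

int main(void)
{
    int neg = -8;
    printf("%d\n", neg >> 1);                           /* implementation-defined in C: how the sign bit propagates */
    printf("%d\n", add(trace("left"), trace("right"))); /* unspecified: which argument is evaluated first */
    int big = INT_MAX;
    ++big;                                              /* undefined behavior: signed integer overflow */
    return 0;
}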

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
AnT stands with Russia
  • 312,472
  • 42
  • 525
  • 765
  • 4
    What's the difference between implementation-defined and unspecified behaviour? – Zolomon Mar 07 '10 at 21:23
  • 31
    @Zolomon: Just like it says: basically the same thing, except that in case of implementation-defined the implementation is required to document (to guarantee) what exactly is going to happen, while in case of unspecified the implementation is not required to document or guarantee anything. – AnT stands with Russia Mar 07 '10 at 21:27
  • 1
    @Zolomon: It's reflected in the difference between 3.4.1 and 3.4.4. – sbi Mar 07 '10 at 21:28
  • Is it possible for a compiler to leave undefined behavior unimplemented? What would happen? For example `a[i] = i++` is undefined. Is it possible for a compiler to be sophisticated enough to be programmed for this event? What would happen, would it just output random assembly? – Celeritas Jun 14 '14 at 00:32
  • @Celeritas: If the compiler recognizes UB, it can prune that branch of execution. If it does not, whatever it outputs might be non-sensical and give the processor severe indigestion. – Deduplicator Sep 20 '14 at 21:26
  • 13
    @Celeritas: Hyper-modern compilers can do better than that. Given `int foo(int x) { if (x >= 0) launch_missiles(); return x << 1; }` a compiler can determine that since all means of invoking the function that don't launch the missiles invoke Undefined Behavior, it can make the call to `launch_missiles()` unconditional. – supercat May 05 '15 at 17:24
  • @AnT: what does it mean that "this international Standard imposes no requirements" in undefined behavior? – Destructor Jul 12 '15 at 05:16
  • @Destructor: Given that C was in wide use for years before the first Standard was written, the phrase "the Standard imposes no requirements" used to mean that implementations similar to those that predated the standard should behave similarly to such pre-existing implementations, absent a compelling reason to do otherwise. In many cases, things were left UB to avoid mandating that existing implementations to change in ways that might make them less efficient or break existing code. – supercat Aug 05 '16 at 20:53
  • @AnT then what is the difference between unspecified and undefined behaviour, if neither guarantee any particular thing to happen? Is it that for unspecified, the same thing must happen each time the program is run on the same machine? – northerner Mar 21 '17 at 07:31
  • 4
    @northerner As the quote states, unspecified behavior is usually restricted to a limited set of possible behaviors. In some cases you might even come to the conclusion that all of these possibilities are acceptable in the given context, in which case unspecified behavior is not a problem at all. Undefined behavior is completely unrestricted (e.g. "the program may decide to format your hard drive"). Undefined behavior is always a problem. – AnT stands with Russia Mar 21 '17 at 07:41
  • @northerner: Doesn't say what exactly? By "a limited set of possible behaviors" I'm referring to the part where the quote says "provides two or more possibilities". – AnT stands with Russia Mar 21 '17 at 07:56
  • @AnT That seems basically the same as undefined behavior. Assuming a computer has a finite number of states, the point "limited set of possible behaviors" is moot. I guess the implicit distinction is unspecified behavior has a much smaller set of possibilities than undefined. – northerner Mar 21 '17 at 08:32
  • 1
    @northerner : Nope. The key point in the case of "unspecified behavior" is the fact that the set of possible behaviors is actually *specified* and *restricted*. And typically it is an easily overseeable set. What's unspecified is which specific possibility (out of that set) will be chosen by the compiler. The implicit distinction is that the set of possibilities is typically restricted and [fairly] platform-independent. In case of undefined behavior it is unrestricted (and therefore is platform-dependent under your "finite number of states" reasoning). – AnT stands with Russia Mar 21 '17 at 08:35
  • 1
    The language specification does not mention "formatting your hard drive" anywhere. Yet, it is not a possibility under any of *unspecified* behaviors, but a possibility under any of *undefined* behaviors. This emphasizes the distinction rather well. – AnT stands with Russia Mar 21 '17 at 08:41
  • @AnT I advise to remove this info from the comments and copy it into the answer. I would myself but edit is disabled. – northerner Mar 21 '17 at 09:13
  • @AnT: I wouldn't describe the distinction as "platform independent", but quite the opposite. On a platform where reads of various addresses trigger hardware actions (many I/O cards for the Apple II, *including the disk controller card*, work that way), it should not be surprising if stray reads trigger unwanted hardware actions, including erasing disks. That does not imply programmers should be required to avoid stray reads at all costs when targeting platforms where they would, at worst, yield a meaningless value. – supercat Jul 29 '19 at 17:48
  • I wish you had a traceable reference to the source document for these quotes so they could be more-easily verified and tracked for changes. – Gabriel Staples Nov 22 '21 at 03:24
  • 1
    @GabrielStaples: I think the latest published Rationale document is at http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf (from what I can tell, no subsequent version of the Standard has a published Rationale document). – supercat Feb 14 '22 at 21:54
71

Maybe simpler wording is easier to understand than the rigorous definitions in the standards.

implementation-defined behavior:
The language says that we have data types. The compiler vendors specify what sizes they use, and provide documentation of what they chose.
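For example, a minimal sketch; the numbers it prints differ between implementations, but each implementation must document its choices:

#include <iostream>

int main()
{
    std::cout << sizeof(int) << ' '
              << sizeof(long) << ' '
              << sizeof(void*) << '\n';  // e.g. "4 8 8" on many 64-bit platforms, "4 4 4" on others
}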

undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit into a char. How do you put that value into a char? There is actually no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it into the char. Just assigning the first byte is wrong, but that's what happens under the hood.

unspecified behavior:
Which of these two functions is executed first?

void fun(int n, int m);

int fun1() {
    std::cout << "fun1";
    return 1;
}
int fun2() {
    std::cout << "fun2";
    return 2;
}

//...

fun(fun1(), fun2()); // which one is executed first?

The language doesn't specify the order of evaluation, left to right or right to left! So unspecified behavior may or may not lead to undefined behavior, but your program certainly should not rely on unspecified behavior.


@eSKay I think your question is worth addressing by editing the answer to clarify further :)

For `fun(fun1(), fun2());` isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?

The difference between implementation-defined and unspecified is that the compiler must pick one behavior and document it in the first case, while it doesn't have to commit to (or document) a single choice in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:

It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.

IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.

Daniel Walker
  • 6,380
  • 5
  • 22
  • 45
Khaled Alshaya
  • 94,250
  • 39
  • 176
  • 234
  • 3
    for `fun(fun1(), fun2());` isn't the behaviour `"implementation defined"`? The compiler has to choose one or the other course, after all? – Lazer Mar 08 '10 at 05:14
  • 1
    @AraK: thanks for the explaining. I understand it now. Btw, `"I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left"` I understand this `can` happen. Does it really, with compilers that we use these days? – Lazer Mar 08 '10 at 10:23
  • 1
    @eSKay You have to ask a guru about this who got his hands dirty with many compilers :) AFAIK VC evaluates arguments right-to-left always. – Khaled Alshaya Mar 08 '10 at 10:28
  • 5
    @Lazer: It can definitely happen. Simple scenario: foo(bar, boz()) and foo(boz(), bar), where bar is an int and boz() is a function returning int. Assume a CPU where parameters are expected to be passed in registers R0-R1. Function results are returned in R0; functions may trash R1. Evaluating "bar" before "boz()" would require saving a copy of bar somewhere else before calling boz() and then loading that saved copy. Evaluating "bar" after "boz()" will avoid a memory store and re-fetch, and is an optimization many compilers would do regardless of their order in the argument list. – supercat Mar 21 '11 at 20:12
  • 6
    I don't know about C++ but the C standard says that a conversion of an int to a char is either implementation defined or even well defined (depending on the actual values and signedness of types). See C99 §6.3.1.3 (unchanged in C11). – Nikolai Ruhe Jan 14 '13 at 10:18
  • AFAIK, __Unspecified:__ _Here's a list of things you can do if you find this, pick the one you like the most_. __Implementation-defined:__ _do what you want to do to solve this problem, as long as it's effective_. __Undefined:__ _do what you want. or don't do anything, I don't care. I'm not here to babysit careless programmers._ – Jenny T-Type Mar 09 '18 at 08:34
  • @JennyT-Type: There is nothing "careless" about using non-portable constructs on quality compilers that support them. I'm not sure why people read the phrase "nonportable **or** erroneous" as implying "erroneous", or why compiler writers don't see the ability to usefully and efficiently process a wide range of non-portable programs as a trait of quality compilers. – supercat Jul 19 '18 at 15:37
  • @supercat. Well, I _could_ agree, C is a language for people that know what they're doing after all, right? The thing is, chances are that we don't always know exactly what we're doing, otherwise the 'C' tag in SO wouldn't make sense at all. I've even come across questions here that make me think the OP was learning C by reading "C for Dummies" or similar material. Those people have no idea, whatsoever, of what they're doing. If you know your platform well enough to know what to expect every time you invoke undefined behavior, then congrats, you're a better programmer than me. – Jenny T-Type Sep 28 '18 at 10:38
  • @JennyT-Type: The Standard suggests as a typical behavior "Behave in a documented fashion characteristic of the environment". Historically, that used to mean that in most cases where an environment had characteristic behaviors that might be useful, implementations would expose them. Unfortunately, the Standard has never bothered to specify any means of saying "give me the platform behavior" since they figured that implementations where that would be useful would naturally do so without the programmer having to do anything special. – supercat Sep 28 '18 at 15:03
  • Saying "unspecified behaviour may result in undefined behaviour" seems simply wrong. – Remember Monica Sep 24 '22 at 16:46
  • @RememberMonica: The term "Undefined Behavior" was intended to categorize situations where the Standard *waives jurisdiction*, so as to avoid having to make any judgment about what constructs should be usable within non-portable programs. Viewed in this light, the fact that the Standard classifies so many things as UB is perfectly reasonable. Unfortunately, some people insist that rather than waiving jurisdiction, the phrase is intended to forbid the use of such constructs even within non-portable programs. – supercat Aug 03 '23 at 17:45
31

From the official C Rationale Document

The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.

Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.

Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.

Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
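The point about implementors augmenting the language by defining officially undefined behavior is visible in practice: GCC and Clang, for example, offer the -fwrapv option, which turns the otherwise undefined signed-integer overflow into documented wrap-around behavior. A minimal sketch (the compile command and file name are just illustrative):

// g++ -fwrapv example.cpp    (or clang++ -fwrapv example.cpp)
#include <climits>
#include <iostream>

int main()
{
    int n = INT_MAX;
    ++n;                     // undefined behavior in standard C and C++,
                             // but documented to wrap to INT_MIN under -fwrapv
    std::cout << n << '\n';
}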

Johannes Schaub - litb
  • 496,577
  • 130
  • 894
  • 1,212
  • 4
    Hyper-modern compiler writers also regard "undefined behavior" as giving compiler writers license to assume that programs will never receive inputs that would cause Undefined Behavior, and to arbitrarily change all aspects of how the programs behave when they receive such inputs. – supercat May 25 '16 at 22:30
  • 2
    Another point I just noticed: C89 did not use the term "extension" to describe features that were guaranteed on some implementations but not others. The authors of C89 recognized that the majority of then-current implementations would treat signed arithmetic and unsigned arithmetic identically except when the results were used in certain ways, and such treatment applied even in case of signed overflow; they did not list that as a common extention in Annex J2, however, which suggests to me they viewed it as a natural state of affairs, rather than an extension. – supercat Aug 19 '16 at 15:49
14

Undefined Behavior vs. Unspecified Behavior has a short description of it.

Their final summary:

To sum up, unspecified behavior is usually something you shouldn't worry about, unless your software is required to be portable. Conversely, undefined behavior is always undesirable and should never occur.

Anders Abel
  • 67,989
  • 17
  • 150
  • 217
  • 1
    There are two kinds of compilers: those which, unless explicitly documented otherwise, interpret most of the Standard's forms of Undefined Behavior as falling back upon characteristic behaviors documented by the underlying environment, and those which by default only usefully expose behaviors which the Standard characterizes as Implementation-Defined. When using compilers of the first type, many things of the first type can be done efficiently and safely using UB. Compilers for the second type will only be suitable for such tasks if they provide options to guarantee behavior in such cases. – supercat Jul 05 '17 at 15:23
12

Implementation-defined -

As the implementors wish; it must be well documented; the standard gives choices, but the code is sure to compile.

Unspecified -

Same as implementation-defined, but not documented.

Undefined -

Anything might happen; take care to avoid it.

Suraj K Thomas
  • 5,773
  • 4
  • 52
  • 64
  • 3
    I think it's important to note that the practical meaning of "undefined" has changed over the last few years. It used to be that given `uint32_t s;`, evaluating `1u< – supercat Apr 16 '15 at 18:25
10

Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.

The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.

Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as a judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.

For example, given the following code:

int scaled_velocity(int v, unsigned char pow)
{
  if (v > 250)
    v = 250;
  if (v < -250)
    v = -250;
  return v << pow;
}

a two's-complement implementation would not have to expend any effort whatsoever to treat the expression v << pow as a two's-complement shift without regard for whether v was positive or negative.

The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
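For completeness, one way to keep the old two's-complement result without relying on the undefined left shift of a negative value is to do the shift in unsigned arithmetic. A sketch of my own (assuming pow is always smaller than the bit width of int):

int scaled_velocity(int v, unsigned char pow)
{
  if (v > 250)
    v = 250;
  if (v < -250)
    v = -250;
  // Shifting an unsigned value is well-defined as long as pow < width of int.
  // Converting an out-of-range result back to int is implementation-defined
  // in C and in C++ before C++20, and wraps modularly in C++20.
  return (int)((unsigned)v << pow);
}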

supercat
  • 77,689
  • 9
  • 166
  • 211
  • But handling undefined behavior in a nice way doesn't come for free. The whole reason that modern compilers exhibit such bizarre behavior in some cases of UB is that they are relentlessly optimizing, and to do the best job at that, they have to be able to assume that UB never occurs. – Tom Swirly May 25 '16 at 16:24
  • 1
    But the fact that `<<` is UB on negative numbers is a nasty little trap and I'm glad to be reminded of that! – Tom Swirly May 25 '16 at 16:27
  • 1
    @TomSwirly: Unfortunately, compiler writers don't care that offering loose behavioral guarantees beyond those mandated by the Standard can often allow a massive speed boost compared with requiring that code avoid at all costs anything not defined by the Standard. If a programmer doesn't care whether `i+j>k` yields 1 or 0 in cases where the addition overflows, *provided it has no other side effects*, a compiler may be able to make some massive optimizations that would not be possible if the programmer wrote the code as `(int)((unsigned)i+j) > k`. – supercat May 25 '16 at 17:48
  • 1
    @TomSwirly: To them, if compiler X can take a strictly-conforming program to do some task T and yield an executable that is 5% more efficient than compiler Y would yield with that same program, that means X is better, even if Y could generate code that did the same task three times as efficiently given a program that exploits behaviors that Y guarantees but X does not. – supercat May 25 '16 at 18:01
  • @supercat If I may ask, what is the optimization applicable to `i+j>k` in such a case? – Petr Skocik Jan 14 '22 at 21:15
  • 1
    @PSkocik: Consider as a simple scenario a situation where a `i`, `j`, and `k` are arguments to a function a compiler is expanding in line for a function call `foo(x, y, x)`. In that scenario, a compiler could replace `i+j > k` with `x+y > x`, which it could in turn replaced with `y > 0`, skipping the addition entirely, eliminating any dependency on the value of `x`, and possibly allowing a compiler to eliminate the comparison and any dependency upon the exact value of `y` if it can determine that `y` will always be positive. – supercat Jan 14 '22 at 21:23
  • Thanks. That's an interesting take I haven't seen before: Optimize as if signed overflow didn't happen but don't make the allow the program to become malformed if you do see it happen. (BTW, I think it's a better example than the one in your answer (`v << pow` can be losslesly rewritten into `(int)((unsigned)v< – Petr Skocik Jan 14 '22 at 22:21
  • @PSkocik: I was feeling a bit snarky when I wrote that answer, but regard as obtuse and dangerous the notion that the Standard's failure to mandate behavior in a particular situation is an invitation for compilers to assume that nothing they might do would be viewed by their users as unacceptable. The Standard deliberately gives implementations which are intended for specialized tasks great latitude to behave in ways that would make them unsuitable for most others; the fact that the Standard allows implementations to do something does not imply any judgement that any implementations should. – supercat Jan 14 '22 at 22:49
  • @PSkocik: As for `v << pow`, should an implementation be allowed to, or forbidden from, replacing e.g. `(v << pow) > 0` with `v > 0` if `v` is signed? If the programmer had cast `v` to unsigned, such substitution would be forbidden, but all cases where the former expression's behavior wouldn't match the latter are regarded as UB under C99 and later. I think that allowing programmers to write `v << pow` in cases where C89 would define the behavior but the substitution would be useful, would make more sense than requiring them to block the optimization by using the unsigned cast. – supercat Jan 14 '22 at 22:57
6

C++ standard n3337 § 1.3.10 implementation-defined behavior

behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents

Sometimes the C++ Standard doesn't impose a particular behavior on a construct but instead says that a particular, well-defined behavior has to be chosen and documented by each implementation (version of the library). So the user can still know exactly how the program will behave, even though the Standard doesn't describe it.
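A classic instance of this (my own example, not from the quoted text) is whether plain char is signed or unsigned; each implementation picks one and documents it:

#include <iostream>
#include <limits>

int main()
{
    // Whether plain char is signed is implementation-defined; the compiler documents
    // its choice (GCC and Clang even let you flip it with -fsigned-char / -funsigned-char).
    std::cout << std::boolalpha
              << std::numeric_limits<char>::is_signed << '\n';
}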


C++ standard n3337 § 1.3.24 undefined behavior

behavior for which this International Standard imposes no requirements [ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. — end note ]

When the program encounters a construct that is not defined by the C++ Standard, it is allowed to do whatever it wants (maybe send an email to me, or send an email to you, or ignore the code completely).


C++ standard n3337 § 1.3.25 unspecified behavior

behavior, for a well-formed program construct and correct data, that depends on the implementation [ Note: The implementation is not required to document which behavior occurs. The range of possible behaviors is usually delineated by this International Standard. — end note ]

The C++ Standard doesn't impose a particular behavior on these constructs either, but instead says that a particular, well-defined behavior has to be chosen (though not necessarily documented) by each implementation (version of the library). So when no description has been provided, it can be difficult for the user to know exactly how the program will behave.
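For instance (my own sketch), whether identical string literals share storage is unspecified, so the program below may legitimately print either 1 or 0, and the implementation doesn't have to tell you which:

#include <iostream>

int main()
{
    const char* a = "hello";
    const char* b = "hello";
    // Whether the two literals are stored in the same object is unspecified,
    // so a == b may be true or false; nothing requires the choice to be documented.
    std::cout << (a == b) << '\n';
}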

starriet
  • 2,565
  • 22
  • 23
4pie0
  • 29,204
  • 9
  • 82
  • 118
1

Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".

Good: a program that compiles and works, for the right reasons.

Bad: a program that has an error, of a kind that the compiler can detect and complain about.

Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.

Some programming languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.

Steve Summit
  • 45,437
  • 7
  • 70
  • 103
  • Constructs which the Standard characterizes as Undefined Behavior are "non-portable *or* erroneous", but the Standard makes no attempt to distinguish those which are erroneous from those which are non-portable *but correct* when processed by the implementations for which they were written or others that are compatible with them. – supercat Jun 16 '21 at 22:42
  • 1
    To be fair, it's often very possible to warn for undefined behavior. – klutt Aug 21 '22 at 18:52