X86 Serializing Instructions


This article contains a categorical list of compiler options. For an alphabetical list, see Compiler options listed alphabetically.

Optimization

Option    Purpose
/O1       Creates small code.
/O2       Creates fast code.
/Ob       Controls inline expansion.
/Od       Disables optimization.
/Og       Deprecated. Uses global optimizations.
/Oi       Generates intrinsic functions.
/Os       Favors small code.
/Ot       Favors fast code.
/Ox       A subset of /O2 that doesn't include /GF or /Gy.
/Oy       Omits frame pointer. (x86 only)
/favor    Produces code that is optimized for a specified architecture, or for a range of architectures.

Code generation

Serializing instructions. The following exception entry instructions are serializing: SVC, SMC, BKPT, instructions that take the prefetch abort handler, and instructions that take the Undefined instruction exception handler. The following instructions that modify mode or program control are serializing: MSR.

The source operand is a byte memory location. If the line selected is already present in the lowest level cache and is already in an exclusively owned state, no data movement occurs. Prefetches from non-writeback memory are ignored. The PREFETCHW instruction is merely a hint and does not affect program behavior.

Option                  Purpose
/arch                   Use SSE or SSE2 instructions in code generation. (x86 only)
/clr                    Produces an output file to run on the common language runtime.
/EH                     Specifies the model of exception handling.
/fp                     Specifies floating-point behavior.
/GA                     Optimizes for Windows applications.
/Gd                     Uses the __cdecl calling convention. (x86 only)
/Ge                     Deprecated. Activates stack probes.
/GF                     Enables string pooling.
/Gh                     Calls hook function _penter.
/GH                     Calls hook function _pexit.
/GL                     Enables whole program optimization.
/Gm                     Deprecated. Enables minimal rebuild.
/GR                     Enables run-time type information (RTTI).
/Gr                     Uses the __fastcall calling convention. (x86 only)
/GS                     Checks buffer security.
/Gs                     Controls stack probes.
/GT                     Supports fiber safety for data allocated by using static thread-local storage.
/guard:cf               Adds control flow guard security checks.
/Gv                     Uses the __vectorcall calling convention. (x86 and x64 only)
/Gw                     Enables whole-program global data optimization.
/GX                     Deprecated. Enables synchronous exception handling. Use /EH instead.
/Gy                     Enables function-level linking.
/GZ                     Deprecated. Enables fast checks. (Same as /RTC1)
/Gz                     Uses the __stdcall calling convention. (x86 only)
/homeparams             Forces parameters passed in registers to be written to their locations on the stack upon function entry. This compiler option is only for the x64 compilers (native and cross compile).
/hotpatch               Creates a hotpatchable image.
/Qfast_transcendentals  Generates fast transcendentals.
/QIfist                 Deprecated. Suppresses the call of the helper function _ftol when a conversion from a floating-point type to an integral type is required. (x86 only)
/Qimprecise_fwaits      Removes fwait commands inside try blocks.
/QIntel-jcc-erratum     Mitigates the performance impact of the Intel JCC erratum microcode update.
/Qpar                   Enables automatic parallelization of loops.
/Qpar-report            Enables reporting levels for automatic parallelization.
/Qsafe_fp_loads         Uses integer move instructions for floating-point values and disables certain floating point load optimizations.
/Qspectre               Enable mitigations for CVE 2017-5753, for a class of Spectre attacks.
/Qspectre-load          Generate serializing instructions for every load instruction.
/Qspectre-load-cf       Generate serializing instructions for every control flow instruction that loads memory.
/Qvec-report            Enables reporting levels for automatic vectorization.
/RTC                    Enables run-time error checking.
/volatile               Selects how the volatile keyword is interpreted.

Output files

Option    Purpose
/doc      Processes documentation comments to an XML file.
/FA       Configures an assembly listing file.
/Fa       Creates an assembly listing file.
/Fd       Renames program database file.
/Fe       Renames the executable file.
/Fi       Specifies the preprocessed output file name.
/Fm       Creates a mapfile.
/Fo       Creates an object file.
/Fp       Specifies a precompiled header file name.
/FR, /Fr  Name generated .sbr browser files.

Preprocessor

Option  Purpose
/AI     Specifies a directory to search to resolve file references passed to the #using directive.
/C      Preserves comments during preprocessing.
/D      Defines constants and macros.
/E      Copies preprocessor output to standard output.
/EP     Copies preprocessor output to standard output.
/FI     Preprocesses the specified include file.
/FU     Forces the use of a file name, as if it had been passed to the #using directive.
/Fx     Merges injected code with the source file.
/I      Searches a directory for include files.
/P      Writes preprocessor output to a file.
/U      Removes a predefined macro.
/u      Removes all predefined macros.
/X      Ignores the standard include directory.

Language

Option      Purpose
/constexpr  Control constexpr evaluation at compile time.
/openmp     Enables #pragma omp in source code.
/vd         Suppresses or enables hidden vtordisp class members.
/vmb        Uses best base for pointers to members.
/vmg        Uses full generality for pointers to members.
/vmm        Declares multiple inheritance.
/vms        Declares single inheritance.
/vmv        Declares virtual inheritance.
/Z7         Generates C 7.0-compatible debugging information.
/Za         Disables C89 language extensions.
/Zc         Specifies standard behavior under /Ze.
/Ze         Deprecated. Enables C89 language extensions.
/Zf         Improves PDB generation time in parallel builds.
/ZH         Specifies MD5, SHA-1, or SHA-256 for checksums in debug info.
/ZI         Includes debug information in a program database compatible with Edit and Continue. (x86 only)
/Zi         Generates complete debugging information.
/Zl         Removes the default library name from the .obj file.
/Zpn        Packs structure members.
/Zs         Checks syntax only.
/ZW         Produces an output file to run on the Windows Runtime.

Linking

Option  Purpose
/F      Sets stack size.
/LD     Creates a dynamic-link library.
/LDd    Creates a debug dynamic-link library.
/link   Passes the specified option to LINK.
/LN     Creates an MSIL module.
/MD     Compiles to create a multithreaded DLL, by using MSVCRT.lib.
/MDd    Compiles to create a debug multithreaded DLL, by using MSVCRTD.lib.
/MT     Compiles to create a multithreaded executable file, by using LIBCMT.lib.
/MTd    Compiles to create a debug multithreaded executable file, by using LIBCMTD.lib.

Miscellaneous

Option                   Purpose
/?                       Lists the compiler options.
@                        Specifies a response file.
/analyze                 Enables code analysis.
/bigobj                  Increases the number of addressable sections in an .obj file.
/c                       Compiles without linking.
/cgthreads               Specifies number of cl.exe threads to use for optimization and code generation.
/errorReport             Deprecated. Error reporting is controlled by Windows Error Reporting (WER) settings.
/FC                      Displays the full path of source code files passed to cl.exe in diagnostic text.
/FS                      Forces writes to the PDB file to be serialized through MSPDBSRV.EXE.
/H                       Deprecated. Restricts the length of external (public) names.
/HELP                    Lists the compiler options.
/J                       Changes the default char type.
/JMC                     Supports native C++ Just My Code debugging.
/kernel                  The compiler and linker will create a binary that can be executed in the Windows kernel.
/MP                      Builds multiple source files concurrently.
/nologo                  Suppresses display of sign-on banner.
/sdl                     Enables additional security features and warnings.
/showIncludes            Displays a list of all include files during compilation.
/Tc                      Specifies a C source file.
/TC                      Specifies all source files are C.
/Tp                      Specifies a C++ source file.
/TP                      Specifies all source files are C++.
/V                       Deprecated. Sets the version string.
/w                       Disables all warnings.
/W0, /W1, /W2, /W3, /W4  Sets output warning level.
/w1, /w2, /w3, /w4       Sets warning level for the specified warning.
/Wall                    Enables all warnings, including warnings that are disabled by default.
/wd                      Disables the specified warning.
/we                      Treats the specified warning as an error.
/WL                      Enables one-line diagnostics for error and warning messages when compiling C++ source code from the command line.
/wo                      Displays the specified warning only once.
/Wv                      Disables warnings introduced by later versions of the compiler.
/WX                      Treats warnings as errors.
/Yc                      Create .PCH file.
/Yd                      Deprecated. Places complete debugging information in all object files. Use /Zi instead.
/Yl                      Injects a PCH reference when creating a debug library.
/Yu                      Uses a precompiled header file during build.
/Y-                      Ignores all other precompiled-header compiler options in the current build.
/Zm                      Specifies the precompiled header memory allocation limit.
/await                   Enable coroutines (resumable functions) extensions.
/source-charset          Set source character set.
/execution-charset       Set execution character set.
/utf-8                   Set source and execution character sets to UTF-8.
/validate-charset        Validate UTF-8 files for only compatible characters.
/diagnostics             Controls the format of diagnostic messages.
/permissive-             Set standard-conformance mode.
/std                     C++ standard version compatibility selector.

Experimental options

Experimental options may only be supported by certain versions of the compiler. They may also behave differently in different compiler versions. Often the best, or only, documentation for experimental options is in the Microsoft C++ Team Blog.

Option                      Purpose
/experimental:module        Enables experimental module support.
/experimental:preprocessor  Enables experimental conforming preprocessor support.

Deprecated and removed compiler options

Option           Purpose
/clr:noAssembly  Deprecated. Use /LN (Create MSIL Module) instead.
/errorReport     Deprecated. Error reporting is controlled by Windows Error Reporting (WER) settings.
/Fr              Deprecated. Creates a browse information file without local variables.
/Ge              Deprecated. Activates stack probes. On by default.
/Gm              Deprecated. Enables minimal rebuild.
/GX              Deprecated. Enables synchronous exception handling. Use /EH instead.
/GZ              Deprecated. Enables fast checks. Use /RTC1 instead.
/H               Deprecated. Restricts the length of external (public) names.
/Og              Deprecated. Uses global optimizations.
/QIfist          Deprecated. Once used to specify how to convert from a floating-point type to an integral type.
/V               Deprecated. Sets the .obj file version string.
/Wp64            Obsolete. Detects 64-bit portability problems.
/Yd              Deprecated. Places complete debugging information in all object files. Use /Zi instead.
/Zc:forScope-    Deprecated. Disables conformance in for loop scope.
/Ze              Deprecated. Enables language extensions.
/Zg              Removed in Visual Studio 2015. Generates function prototypes.

See also

C/C++ building reference
MSVC compiler options
MSVC compiler command-line syntax


The Time Stamp Counter (TSC) is a 64-bit register present on all x86 processors since the Pentium. It counts the number of cycles since reset. The instruction RDTSC returns the TSC in EDX:EAX. In x86-64 mode, RDTSC also clears the upper 32 bits of RAX and RDX. Its opcode is 0F 31.[1] Pentium competitors such as the Cyrix 6x86 did not always have a TSC and may treat RDTSC as an illegal instruction. Cyrix included a Time Stamp Counter in its MII.
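
A minimal sketch, in C, of how the two 32-bit halves that RDTSC places in EDX:EAX can be combined into one 64-bit count. The helper names tsc_combine and read_tsc are illustrative only, and the inline-assembly reader assumes a GCC- or Clang-compatible compiler targeting x86.

```c
#include <stdint.h>

/* Combine the high half (EDX) and low half (EAX) into one 64-bit value.
   Hypothetical helper name, used here for illustration. */
static inline uint64_t tsc_combine(uint32_t edx, uint32_t eax) {
    return ((uint64_t)edx << 32) | eax;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Read the TSC with RDTSC. Assumes GCC/Clang extended inline assembly. */
static inline uint64_t read_tsc(void) {
    uint32_t eax, edx;
    __asm__ __volatile__("rdtsc" : "=a"(eax), "=d"(edx));
    return tsc_combine(edx, eax);
}
#endif
```

On an x86 build, read_tsc() returns the current counter; the difference between two reads gives an elapsed tick count, subject to the caveats described in the sections that follow.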

Use

The Time Stamp Counter was once an excellent high-resolution, low-overhead way for a program to get CPU timing information. With the advent of multi-core/hyper-threaded CPUs, systems with multiple CPUs, and hibernating operating systems, the TSC cannot be relied upon to provide accurate results unless great care is taken to correct the possible flaws: the rate of tick and whether all cores (processors) have identical values in their time-keeping registers. There is no promise that the timestamp counters of multiple CPUs on a single motherboard will be synchronized. Therefore, a program can get reliable results only by limiting itself to run on one specific CPU. Even then, the CPU speed may change because of power-saving measures taken by the OS or BIOS, or the system may be hibernated and later resumed, resetting the TSC. In those latter cases, to stay relevant, the program must re-calibrate the counter periodically.

Relying on the TSC also reduces portability, as other processors may not have a similar feature. Recent Intel processors include a constant rate TSC (identified by the kern.timecounter.invariant_tsc sysctl on FreeBSD or by the 'constant_tsc' flag in Linux's /proc/cpuinfo). With these processors, the TSC ticks at the processor's nominal frequency, regardless of the actual CPU clock frequency due to turbo or power saving states. Hence TSC ticks are counting the passage of time, not the number of CPU clock cycles elapsed.

On Windows platforms, Microsoft strongly discourages using the TSC for high-resolution timing for exactly these reasons, providing instead the Windows APIs QueryPerformanceCounter and QueryPerformanceFrequency.[2] On POSIX systems, a program can get similar functionality by reading the value of the CLOCK_MONOTONIC clock using the clock_gettime function.[3]
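
The POSIX alternative mentioned above can be sketched as follows. The helper name monotonic_ns is illustrative, and returning 0 on failure is a simplification chosen for this example rather than a convention of the API.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Return the CLOCK_MONOTONIC time in nanoseconds, or 0 if the call fails.
   Hypothetical helper name, used here for illustration. */
static inline uint64_t monotonic_ns(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return 0;
    }
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

Subtracting two monotonic_ns() readings gives an elapsed time in nanoseconds without depending on TSC behavior. On some older C libraries, clock_gettime may require linking with -lrt.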

Starting with the Pentium Pro, Intel processors have practiced out-of-order execution, where instructions are not necessarily performed in the order they appear in the program. This can cause the processor to execute RDTSC earlier than a simple program expects, producing a misleading cycle count.[4] The programmer can solve this problem by inserting a serializing instruction, such as CPUID, to force every preceding instruction to complete before allowing the program to continue, or by using the RDTSCP instruction, a variant of RDTSC that waits for all preceding instructions to execute before reading the counter, although it is not a fully serializing instruction.
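
As a rough illustration of the CPUID approach described above, the following C sketch executes CPUID immediately before RDTSC. It assumes a GCC- or Clang-compatible compiler targeting x86; the names tsc_read_serialized and TSC_AVAILABLE are hypothetical, and the fallback that returns 0 on other targets exists only so the sketch compiles everywhere.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define TSC_AVAILABLE 1

/* Execute CPUID (a serializing instruction) so that every earlier
   instruction completes, then read the counter with RDTSC. */
static inline uint64_t tsc_read_serialized(void) {
    uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)ebx;
    (void)ecx;
    __asm__ __volatile__("rdtsc" : "=a"(eax), "=d"(edx));
    return ((uint64_t)edx << 32) | eax;
}
#else
#define TSC_AVAILABLE 0

/* Placeholder for non-x86 targets: no TSC to read. */
static inline uint64_t tsc_read_serialized(void) { return 0; }
#endif
```

CPUID is comparatively slow, so its cost is included in any interval measured this way; measuring an empty interval first and subtracting that overhead is one common way to account for it.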

Implementation in various processors

Intel processor families increment the time-stamp counter differently:[5]

  • For Pentium M processors (family [06H], models [09H, 0DH]); for Pentium 4 processors, Intel Xeon processors (family [0FH], models [00H, 01H, or 02H]); and for P6 family processors: the time-stamp counter increments with every internal processor clock cycle. The internal processor clock cycle is determined by the current core-clock to bus-clock ratio. Intel SpeedStep technology transitions may also impact the processor clock.
  • For Pentium 4 processors, Intel Xeon processors (family [0FH], models [03H and higher]); for Intel Core Solo and Intel Core Duo processors (family [06H], model [0EH]); for the Intel Xeon processor 5100 series and Intel Core 2 Duo processors (family [06H], model [0FH]); for Intel Core 2 and Intel Xeon processors (family [06H], display_model [17H]); for Intel Atom processors (family [06H], display_model [1CH]): the time-stamp counter increments at a constant rate. That rate may be set by the maximum core-clock to bus-clock ratio of the processor or may be set by the maximum resolved frequency at which the processor is booted. The maximum resolved frequency may differ from the maximum qualified frequency of the processor.

The specific processor configuration determines the behavior. Constant TSC behavior ensures that the duration of each clock tick is uniform and makes it possible to use the TSC as a wall clock timer even if the processor core changes frequency. This is the architectural behavior for all later Intel processors.
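
With a constant-rate TSC, a tick count converts to wall-clock time by dividing by the tick frequency. The sketch below assumes the frequency is already known (for example, from a prior calibration against another clock); the function names and the tsc_hz parameter are hypothetical.

```c
#include <stdint.h>

/* Convert a tick count to seconds, given the constant TSC rate in Hz. */
static inline double tsc_ticks_to_seconds(uint64_t ticks, double tsc_hz) {
    return (double)ticks / tsc_hz;
}

/* Elapsed seconds between two readings. Unsigned subtraction also gives
   the right difference if the 64-bit counter wrapped once in between. */
static inline double tsc_elapsed_seconds(uint64_t start, uint64_t end,
                                         double tsc_hz) {
    return tsc_ticks_to_seconds(end - start, tsc_hz);
}
```

For example, 2,400,000,000 ticks on a counter running at 2.4 GHz correspond to one second, whatever the core frequency was during the interval.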

AMD processors up to the K8 core always incremented the time-stamp counter every clock cycle.[6] Thus, power management features were able to change the number of increments per second, and the values could get out of sync between different cores or processors in the same system. For Windows, AMD provides a utility[7] to periodically synchronize the counters on multiple core CPUs. Since the family 10h (Barcelona/Phenom), AMD chips feature a constant TSC, which can be driven either by the HyperTransport speed or the highest P state. A CPUID bit (Fn8000_0007:EDX_8) advertises this; Intel CPUs also report their invariant TSC on that bit.
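
A hedged sketch of testing the CPUID bit mentioned above (leaf 8000_0007h, EDX bit 8). It assumes the <cpuid.h> header shipped with GCC and Clang; the function names are illustrative, and on non-x86 targets the query simply reports false.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bit 8 of the EDX value from CPUID leaf 0x80000007 is set. */
static inline bool edx_reports_invariant_tsc(uint32_t edx) {
    return ((edx >> 8) & 1u) != 0;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Query the running CPU; returns false if the leaf is not supported. */
static inline bool cpu_has_invariant_tsc(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx)) {
        return false;
    }
    return edx_reports_invariant_tsc(edx);
}
#else
/* Placeholder for non-x86 targets. */
static inline bool cpu_has_invariant_tsc(void) { return false; }
#endif
```

A program could call cpu_has_invariant_tsc() once at startup and fall back to an operating-system clock when it returns false.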

Operating system use

An operating system may provide methods that both use and don't use the RDTSC instruction for time keeping, under administrator control. For example, on some versions of the Linux kernel, seccomp sandboxing mode disables RDTSC.[8] It can also be disabled using the PR_SET_TSC argument to the prctl() system call.[9]
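
As a small Linux-specific sketch, the companion PR_GET_TSC option of prctl() reports whether the calling process is currently allowed to execute RDTSC. The helper name tsc_access_mode is hypothetical; the call is only meaningful on x86 Linux, and the sketch returns -1 where it fails or is unavailable.

```c
#if defined(__linux__)
#include <sys/prctl.h>

/* Return PR_TSC_ENABLE (1) if RDTSC is permitted, PR_TSC_SIGSEGV (2) if
   executing it raises SIGSEGV, or -1 if the query fails. */
static inline int tsc_access_mode(void) {
    int mode = 0;
    if (prctl(PR_GET_TSC, &mode) != 0) {
        return -1;
    }
    return mode;
}
#else
/* Placeholder for non-Linux targets. */
static inline int tsc_access_mode(void) { return -1; }
#endif
```

A process that wants to forbid RDTSC for itself would pass PR_TSC_SIGSEGV to PR_SET_TSC instead; that variant is omitted here because it changes process state.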

Use in exploiting cache side-channel attacks

The time stamp counter can be used to time instructions accurately, which can be exploited in the Meltdown and Spectre security vulnerabilities.[10][11] However, if it is not available, other counters or timers can be used, as is the case with the ARM processors vulnerable to this type of attack.

Other architectures

Other processors also have registers which count CPU clock cycles, but with different names. For instance, on the AVR32, it is called the Performance Clock Counter (PCCNT) register. SPARC V9 provides the TICK register. ARMv7[12] and ARMv8[13] architectures provide a generic counter which counts at a constant frequency. PowerPC provides the 64-bit TBR register.

See also

  • High Precision Event Timer (HPET)

References

  1. Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2B: Instruction Set Reference, M-Z (PDF). p. 460.
  2. Game Timing and Multicore Processors. pp. 251–252.
  3. 'clock_getres, clock_gettime, clock_settime - clock and timer functions'.
  4. 'Using the RDTSC Instruction for Performance Monitoring' (PDF).
  5. 'Volume 3A, Chapter 16'. Intel 64 and IA-32 Architectures Software Developer's Manual.
  6. 'Volume 3'. AMD64 Architecture Programmer's Manual.
  7. 'AMD Dual-Core Optimizer'.
  8. 'cr0 blog: Time-stamp counter disabling oddities in the Linux kernel'. May 2009.
  9. prctl(2) – Linux Programmer's Manual – System Calls.
  10. 'meltdown.c'.
  11. 'spectre.c'.
  12. 'ARMv7 reference manual'.
  13. 'ARMv8 reference manual'.

External links

  • cycle.h - C code to read the high-resolution timer on many CPUs and compilers.
  • [1] - Very simple C code to read the timer on an x86 machine. This reads the 64-bit value into two 32-bit integers and combines them; using a single 64-bit integer is another option.