As in the OpenCL post, the default samples that ship with the CUDA SDK are a big, complicated mess (although some of the online resources are better). As such, here is a minimal implementation of the same simple setup, covering the most basic things in GPGPU.
CUDA is a little different from OpenCL. Unless you are compiling and linking the device code separately, CUDA code is written as if it were part of the C++ language, in the same source file as the host code, and those parts of the code are compiled with NVCC.
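To make the single-source idea concrete, here is a minimal sketch of a file that mixes host and device code. The kernel name `hello` and the launch size are arbitrary choices for illustration, not anything from the SDK samples.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// __global__ marks a function that runs on the GPU and is launched from the host.
// The name "hello" is just an illustrative choice.
__global__ void hello()
{
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main()
{
    // The <<<blocks, threads>>> launch syntax is a CUDA extension that NVCC understands.
    hello<<<1, 4>>>();

    // Kernel launches are asynchronous, so wait for the GPU before exiting.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Assuming the file is saved as `hello.cu` (a hypothetical name), something like `nvcc hello.cu -o hello` should build it from the command line.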
If you want to build a CUDA application in Visual Studio, the easiest way is to create a new project from the Visual Studio home screen and select NVIDIA CUDA. Alternatively, you can switch the compilation of each cpp file in the Solution Explorer to use the NVIDIA compiler. This should all be available if you installed the CUDA Toolkit (SDK).
With all that said, here is the simple demo, doing the same as the OpenCL demo. That is: initialise the device, allocate some memory, run a kernel to fill that memory with 42, wait for the kernel to finish, copy the data back to the host CPU and check that it is valid.
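The steps listed above can be sketched roughly as follows. This is one possible layout under stated assumptions, not a definitive implementation: the `CHECK` macro, the kernel name `fillKernel`, the buffer size of 1024 ints and the use of device 0 are all illustrative choices.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: print the failing call and exit if a runtime call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s failed: %s\n", #call,                \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

// Each thread writes one element; the bounds check guards the last block.
__global__ void fillKernel(int* data, int count, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) {
        data[i] = value;
    }
}

int main()
{
    const int count = 1024;                  // assumed buffer size
    const size_t bytes = count * sizeof(int);

    // Initialise the device (device 0 is an assumption).
    CHECK(cudaSetDevice(0));

    // Allocate some memory on the device.
    int* deviceData = nullptr;
    CHECK(cudaMalloc(&deviceData, bytes));

    // Run a kernel to fill that memory with 42.
    const int threads = 256;
    const int blocks = (count + threads - 1) / threads;
    fillKernel<<<blocks, threads>>>(deviceData, count, 42);
    CHECK(cudaGetLastError());               // catches launch errors

    // Wait for the kernel to finish.
    CHECK(cudaDeviceSynchronize());

    // Copy the data back to the host.
    std::vector<int> hostData(count, 0);
    CHECK(cudaMemcpy(hostData.data(), deviceData, bytes,
                     cudaMemcpyDeviceToHost));
    CHECK(cudaFree(deviceData));

    // Check it is valid.
    for (int i = 0; i < count; ++i) {
        if (hostData[i] != 42) {
            fprintf(stderr, "Mismatch at index %d: %d\n", i, hostData[i]);
            return 1;
        }
    }
    printf("All %d values are 42\n", count);
    return 0;
}
```

Checking `cudaGetLastError()` straight after the launch and then the result of `cudaDeviceSynchronize()` separates launch failures from errors raised while the kernel runs, which makes a demo this small easier to debug.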