A Deep Reinforcement Learning (DRL) agent aims to learn a policy that maximizes a long-term objective through trial and error in large state spaces (Sutton and Barto, 2018). However, this learning paradigm requires a substantial number of interactions with the environment to achieve good performance. Moreover, critical applications, such as robotics (OpenAI et al., 2019), typically involve safety criteria that must be considered when designing novel DRL solutions. Hence, devising safe learning approaches with efficient exploration is crucial to avoid getting stuck in local optima, failing to learn, or causing damage to the surrounding environment (García and Fernández, 2015).

This thesis focuses on developing Deep Reinforcement Learning algorithms that foster efficient exploration and safer behaviors in simulated and real-world domains of interest, ranging from robotics to multi-agent systems. To this end, we rely both on standard benchmarks, such as Safety Gym (Ray et al., 2019), and on robotic tasks widely adopted in the literature (e.g., manipulation (Gu et al., 2017) and navigation (Tai et al., 2017)). This variety of problems is crucial for assessing the statistical significance of our empirical studies and the generalization capabilities of our approaches (Henderson et al., 2018).

We initially benchmark the sample-efficiency versus performance trade-off between value-based and policy-gradient algorithms. This part highlights the benefits of using non-standard simulation environments (i.e., Unity (Juliani et al., 2018)), which also facilitate the development of further optimizations for DRL. We also discuss the limitations of standard evaluation metrics (e.g., return) in characterizing the actual behaviors of a policy, proposing the use of Formal Verification (FV) (Liu et al., 2019) as a practical methodology for evaluating behaviors against desired specifications.

The second part introduces Evolutionary Algorithms (EAs) (Fogel, 2006) as a complementary gradient-free optimization strategy. In detail, we combine population-based and gradient-based DRL to diversify exploration and improve performance in both single-agent and multi-agent applications. For the latter, we discuss how prior Multi-Agent (Deep) Reinforcement Learning (MARL) approaches hinder exploration (Rashid et al., 2018), and propose an architecture that favors cooperation without hampering exploration.